Improved Matching Algorithms in Q
Table of Contents
 Meetup Details
 Motivation
 Simple Example
 Dictionary Literal Digression
 Engagement Demonstration
 Roommate Demonstration
 Stable Marriage (SM) Problem
 Stable Roommates (SR) Problem
 HospitalResident (HR) Problem
 StudentAllocation (SA) Problem
 Performance
 Q Enhancements
 Summary
Meetup Details
Motivation
 Interndesk pairing
 Two distinct populations
 Interns rank desks
 Desks rank interns
 Prevent backroom deals
 Is there a single optimal pairing?
Simple Example
 Use Kdb+ 4.1 dictionary literal syntax to build suitor and reviewer preferences as well as engagement dictionary
q)show S:([A:`D`E`F;B:`D`F`E;C:`F`D`E]) / suitor preferences
A D E F
B D F E
C F D E
q)show R:([D:`B`C`A;E:`A`C`B;F:`C`B`A]) / reviewer preferences
D B C A
E A C B
F C B A
q)show E:([A:`E;B:`D;C:`F]) / engagements
A E
B D
C F
 Confirm algorithm produces optimal engagements
q)E~first .matching.sm[S;R]
1b
Dictionary Literal Digression

Intuitively extends the table and list syntax
q)([a:1;b:2]) a 1 b 2

Keys are always symbols and there is no zerocolumn table
q)0N!([]) (`symbol$())!()

There is now an elegant way to create a single element dictionary!
q)enlist[`k]!enlist`v k v q)([k:`v]) k v
Engagement Demonstration
 Two sets of index cards
 Qgods: 09
 Qbies: AJ
 Pairoff so everyone is ‘happy’
 How did we do?
Qgods and Qbies
q)show g:.Q.n / define qgods
"0123456789"
q)show b:10#.Q.A / define qbies
"ABCDEFGHIJ"
Qgod and Qbie Preferences
q)\S 314159i / reset seed
q)show G:g!10?/:10#enlist b / define qgod prefs
0 "ICEBGAFDHJ"
1 "GHDCJBAIEF"
2 "HBDJFAEGIC"
3 "BIFHADGECJ"
4 "EDICJFHABG"
5 "BEHGACFDIJ"
6 "AGCIHEJBDF"
7 "GBHDEIFACJ"
8 "JGEBHADICF"
9 "FIGJBACHED"
q)show B:b!10?/:10#enlist g / define qbie prefs
A "9378126054"
B "8014967352"
C "7284539016"
D "3217965084"
E "7051684329"
F "6943720851"
G "4501672893"
H "7681405392"
I "7458106239"
J "9154860327"
QgodOptimal Matches
 The suitor obtains the best possible match among all stable matches
q)geSR:.matching.sm[G;B] / perform qgodoptimal match
q)geSR 0 / engagements
0 C
1 G
2 D
3 H
4 I
5 E
6 A
7 B
8 J
9 F
q)geSR 1 / remaining qgod prefs
0 "CEBGH"
1 "GHJBA"
2 "DAC"
3 "HADC"
4 "ICJHBG"
5 "EHGCJ"
6 "AHBF"
7 "BHEIAC"
8 "JBHAC"
9 "FJBAC"
Qbie Matches
 The reviewer obtains the worst possible match among all stable matches
q)geSR 0 / engagements
0 C
1 G
2 D
3 H
4 I
5 E
6 A
7 B
8 J
9 F
q)geSR 2 / remaining qbie prefs
A "9378126"
B "8014967"
C "72845390"
D "32"
E "705"
F "69"
G "4501"
H "76814053"
I "74"
J "91548"
QbieOptimal Matches
 Allowing Qbies to propose first improves their matches
 Matches are still stable
q)beSR:.matching.sm[B;G] / perform qbieoptimal match
q)beSR 0 / engagements
A 3
B 8
C 0
D 2
E 5
F 6
G 1
H 7
I 4
J 9
Uniqueness
 The Qgodoptimal and Qbieoptimal engagements are not always the same
 Use Kdb+ 4.1 pattern matching to invert the Qbie dictionary
q)geSR[0],'geSR[0] = {[k!v]v!k} beSR 0
0 "C" 1b
1 "G" 1b
2 "D" 1b
3 "H" 0b
4 "I" 1b
5 "E" 1b
6 "A" 0b
7 "B" 0b
8 "J" 0b
9 "F" 0b
Roommate Demonstration
 One set of index cards
 Qroomies: AJ
 Pairoff so everyone is ‘happy’
 How did we do?
Qroomie Preferences
 Roommates rank everyone else but themselves
q)show R:b!value[G] except' b / define qroomie prefs
A "ICEBGFDHJ"
B "GHDCJAIEF"
C "HBDJFAEGI"
D "BIFHAGECJ"
E "DICJFHABG"
F "BEHGACDIJ"
G "ACIHEJBDF"
H "GBDEIFACJ"
I "JGEBHADCF"
J "FIGBACHED"
Qroomie Matches
 The first stage of the algorithm can produce multiple solutions
 The second stage of the algorithm removes ‘cycles’
q)show reR:.matching.sr R
"ABCDEFGHIJ"!"GHFEDCABJI"
"ABCDEFGHIJ"!("CG";"HD";"FA";"BE";"DF";"EC";"AH";"GB";,"J";,"I")
"ABCDEFGHIJ"!(,"G";,"H";,"F";,"E";,"D";,"C";,"A";,"B";,"J";,"I")
q)reR 1
A "CG"
B "HD"
C "FA"
D "BE"
E "DF"
F "EC"
G "AH"
H "GB"
I ,"J"
J ,"I"
Stable Marriage (SM) Problem
Given two distinct populations how do you create matches such that no pair prefers each other over their current matching?
Stable Marriage Algorithm
The GaleShapley 1962 (Deferred Acceptance) algorithm:
 All participants rank partners
 Iteratively engage each suitor:
 Return early if every suitor is engaged
 Find preferred reviewer for next single suitor
 If reviewer is single, they accepted suitor
 Else, allow reviewer to renege and upgrade – old suitor gets to try again
 Return updated engagement vector, and suitor and reviewer preference vectors
Stable Marriage Theorems
 The algorithm completes in a finite number of steps
 The algorithm terminates in at most \(n^2  n + 1\) iterations
 The algorithm always produces stable matches
 The matches are always suitoroptimal (and reviewerpessimal)
 The matches are unique if suitoroptimal and revieweroptimal results are identical
Enumerating Preference Maps
 Humans prefer names
 Algorithms prefer indices
 We convert ranking dictionaries to 0based lists by enumerating
each value with the
?
find operator
q)G / qgod prefs
0 "ICEBGAFDHJ"
1 "GHDCJBAIEF"
2 "HBDJFAEGIC"
3 "BIFHADGECJ"
4 "EDICJFHABG"
5 "BEHGACFDIJ"
6 "AGCIHEJBDF"
7 "GBHDEIFACJ"
8 "JGEBHADICF"
9 "FIGJBACHED"
q)key B / qbie enumeration vector
"ABCDEFGHIJ"
q)show S:key[B]?value G / qgod enumerations
8 2 4 1 6 0 5 3 7 9
6 7 3 2 9 1 0 8 4 5
7 1 3 9 5 0 4 6 8 2
1 8 5 7 0 3 6 4 2 9
4 3 8 2 9 5 7 0 1 6
1 4 7 6 0 2 5 3 8 9
0 6 2 8 7 4 9 1 3 5
6 1 7 3 4 8 5 0 2 9
9 6 4 1 7 0 3 8 2 5
5 8 6 9 1 0 2 7 4 3
q)R:key[G]?value B / gbie enumerations
Stable Marriage Wrapper
 Use Kdb+ 4.1 pattern matching to unpack dictionary parameters
 Enumerate the suitor and reviewer dictionaries
 Build allnull engagement vector
 Iterate with
.matching.sma
until convergence  Convert enumerations back to dictionaries
/ given (S)uitor and (R)eviewer preferences, return the (e)ngagement
/ dictionary and remaining (S)uitor and (R)eviewer preferences for inspection
sm:{[sn!sp;rn!rp]
eSR:(count[sn]#0N;rn?sp;sn?rp); / initial state/enumerated values
eSR:sma over eSR; / iteratively apply GaleShapley algorithm
eSR:(sn;sn;rn)!'(rn;rn;sn)@'eSR; / map enumerations back to original values
eSR}
Stable Marriage Implementation
 Manually unpack arguments because
SR
may be a single matrix or a pair of matrices  The suitor and reviewer indices –
Si
andRi
respectively – are defined as variables so that the same function can be used for the Stable Roommates (SR) Problem  Use Kdb+ 4.1 pattern matching to unpack
.matching.prune
results
/ given (e)ngagement vector and (S)uitor and (R)eviewer preferences, find
/ next engagement, remove undesirable suitors and unavailable reviewers.
/ roommate preferences are assumed if (R)eviewer preferences are not
/ provided.
sma:{[eSR]
e:eSR 0;S:eSR Si:1;R:eSR Ri:1+count eSR; / manually unpack
mi:?[;1b] 0<count each S w:where null e; / first unmatched with prefs
if[mi=count w;:eSR]; / no unmatched suitor
rp:R ri:first s:S si:w mi; / preferred reviewer's prefs
if[count[rp]=sir:rp?si;:.[eSR;(Si;si);1_]]; / not on reviewer's list
/ renege if already engaged and this suitor is better
if[not count[e]=ei:e?ri;if[sir<rp?ei;eSR:.[eSR;(Si;ei);1_];e[ei]:0N]];
e[si]:ri; eSR[0]:e; / get engaged
(eSR Si;eSR[Ri;ri]):prune[eSR Si;rp;ri;si]; / assignment order matters
eSR}
Pruning Preference Vectors
 Once a suitor and reviewer are engaged, we can make two optimizations:
 Remove all reviewer preferences that are worse than the current suitor
 Remove the reviewer from all worse suitors’ preferences
 Kdb+ 4.1 pattern matching is used yet again to unpack the results
of
cut
/ given (S)uiter preferences, (r)eviewer (p)refs, and (s)uitor (i)ndice(s)
/ and (r)eviewer (i)ndice(s), return the pruned reviewer and Suitor prefs
prune:{[S;rp;ris;sis]
if[count[rp]<=i:1+max rp?sis;:(S;rp)]; / return early if nothing to do
(rp;sis):(0;i) cut rp; / drop worse suitors from preferences
S:S @[;sis;drop;]/ ris; / drop reviewers from worse suitors
(S;rp)}
Pruning Example
 Assume suitor 0 proposes to reviewer 4
 All suitors past 0 are removed from reviewer prefs
 Reviewer 4 is removed from the suitors that were cut
q)Srp:.matching.prune[S;rp:R ri;ri:4;si:0]; / prune
q)show rp / initial reviewer prefs
7 0 5 1 6 8 4 3 2 9
q)show last Srp / everything past 0 is cut
7 0
q)show first Srp / 4 is dropped from cut reviewers
8 2 4 1 6 0 5 3 7 9
6 7 3 2 9 1 0 8 5
7 1 3 9 5 0 6 8 2
1 8 5 7 0 3 6 2 9
3 8 2 9 5 7 0 1 6
1 7 6 0 2 5 3 8 9
0 6 2 8 7 9 1 3 5
6 1 7 3 4 8 5 0 2 9
9 6 1 7 0 3 8 2 5
5 8 6 9 1 0 2 7 3
Pruning Logistics
.matching.prune
handles lists of suitors and reviewers
 The Stable Roommates (SR) Problem requires the Suitor and Reviewer preferences to be the same data structure
 The HospitalResident (HR) Problem requires the function to prune the worst of multiple residents (acting as suitor) when the hospital reaches capacity
 The StudentAllocation (SA) Problem requires the function to prune multiple students (acting as suitor) and the worst of multiple projects (acting as reviewer)
/ given (S)uiter preferences, (r)eviewer (p)refs, and (s)uitor (i)ndice(s)
/ and (r)eviewer (i)ndice(s), return the pruned reviewer and Suitor prefs
prune:{[S;rp;ris;sis]
if[count[rp]<=i:1+max rp?sis;:(S;rp)]; / return early if nothing to do
(rp;sis):(0;i) cut rp; / drop worse suitors from preferences
S:S @[;sis;drop;]/ ris; / drop reviewers from worse suitors
(S;rp)}
Stable Marriage Execution
 The implementation returns the engagements as well as the remaining unpruned suitor and reviewer preferences as dictionaries
 Strictly speaking, we only need to return the engagement dictionary, but having access to the remaining preferences adds intuition
q).matching.sm[B;G]
"ABCDEFGHIJ"!"3802561749"
"ABCDEFGHIJ"!("36";"867352";"06";"264";"5684";"693";"16789";"7632";"40639";"986")
"0123456789"!("IC";,"G";"HBD";"BIFHA";"EDI";"BE";"AGCIHEJBDF";"GBH";"JGEB";"FIGJ")
Stable Marriage Vector Observations
 The
?
find operator is used in 5 different ways: Enumerate dictionary values
 Search engagement vector for the next single suitor
 Search engagement vector to see if reviewer is engaged
 Compare ranking between suitor and existing suitor
 Search reviewer preferences when pruning worse suitors
 The engagement vector remains the same length across iterations, but the preference vectors shrink as suitors are pruned
 Each iteration needs to unpack the single argument into distinct variables and then pack them back up for the next iteration
Strategy
 Suitors can not improve their results by changing their rankings
 Reviewers can (sometimes) improve their results by truncating their rankings – but risk not getting matched at all
 Reviewer “H” originally gets 8th suitor on their list
 By not permitting this matching, they (and “A” as well) improve their match
q)1+(,'/)(B?'{[k!v]v!k} first .matching.sm[G]::) each 1 @[;"H";7#]\ B
A 7 2
B 7 7
C 8 8
D 2 2
E 3 3
F 2 2
G 4 4
H 8 2
I 2 2
J 5 5
Stable Roommates (SR) Problem
 What if we only had a single population?
 Each participant is required to rank every other participant
 It is possible that no stable solution exists
Stable Roommates Algorithm
 Robert W. Irving published a 2phase solution in 1985
 Phase 1 passes the roommate preferences to the GaleShapley algorithm as both the suitor and reviewer
 Since
q
does not allow passing by pointer, the.matching.sma
function was conditioned on how many preference lists were passed  Phase 2 removes ‘cycles’ which are rotations that produce equally stable solutions
Stable Roommates Wrapper
 Kdb+ 4.1 pattern matching is used to unpack the dictionary parameter
 The preferences are enumerated
 The results are unenumerated before being passed back as a dictionary
/ given (R)oomate preference dictionary, return the (a)ssignment dictionary
/ and (R)oommate preference dictionaries from each decycle stage
sr:{[rn!rp]
aR:(count[rn]#0N;rn?rp); / initial assignment/enumerated values
aR:sra aR; / apply stable roommate (SR) algorithm
aR:rn!/:rn aR; / map enumerations back to original values
aR}
Stable Roommates Algorithm
 Phase 1 applies the stable marriage (GaleShapley) algorithm
 Kdb+ 4.1 pattern matching is used to unpack the list and throw away the first element
 The results of phase 1 are then passed to
.matching.decycle
to remove unstable cycles  A final assignment vector is prepended to the intermediate ‘decycle’ states before being returned
/ given (a)ssignment vector and (R)oomate preferences, return the completed
/ (a)ssignment vector (R)oommate preferences from each decycle stage
sra:{[aR]
(;R):sma over aR; / apply phase 1 and throw away assignments
R:decycle scan R; / apply phase 2
aR:enlist[last[R][;0]],R; / prepend assignment vector
aR}
Decycling Roommate Assignments
 The algorithm has no solution if any participant goes unmatched
 The algorithm terminates when all participants are uniquely matched
 Cycles are discovered with the
.matching.cycle
function and removed with the.matching.pruner
roommate prune function – neither of which will be discussed
/ phase 2 of the stable roommates (SR) problem removes all cycles within the
/ remaining candidates leaving the one true stable solution
decycle:{[R]
if[any 0=c:count each R;'`unstable]; / unable to match a roommate
if[count[c]=i:?[;1b] c>1;:R]; / first roommate with multiple prefs
c:cycle[R] enlist (i;R[i;0]); / build the cycle starting here
R:pruner/[R;c[;1];1 rotate c[;0]]; / prune prefs based on dropped cycle
R}
Stable Roommates Setup
 A worked example (including decycling) can be found on the Stable Roommates Problem Wikipedia page
 Each participant ranks all other participants
 Even though this example uses integers, the algorithm requires 0index enumerations so we create a dictionary and supply it to the algorithm wrapper
q)show R:(1+til count R)!R:get each read0 `wmate.txt
1 3 4 2 6 5
2 6 5 4 1 3
3 2 4 5 1 6
4 5 2 3 6 1
5 3 1 2 4 6
6 5 1 3 4 2
Stable Roommates Execution
 The
.matching.sr
function produces: the assignment dictionary
 the results of the GaleShapley algorithm
 each step of the decycling process
 Notice how the assignment dictionary is symmetric – 1 is assigned 6 and 6 is assigned 1
q).matching.sr R
1 2 3 4 5 6!6 4 5 2 3 1
1 2 3 4 5 6!(4 2 6;6 5 4 1 3;2 4 5;5 2 3 6 1;3 2 4;1 4 2)
1 2 3 4 5 6!(2 6;6 5 4 1;4 5;5 2 3;3 2 4;1 2)
1 2 3 4 5 6!(,6;5 4;4 5;2 3;3 2;,1)
1 2 3 4 5 6!(,6;,4;,5;,2;,3;,1)
Stable Roommates Vector Observations
 The
?
find operator is used two more times: Search roommate preference counts for decycle opportunities
 Search chain for ‘tail’ location so the nonrepeating section can be excluded from the cycle
HospitalResident (HR) Problem
 What if there was capacity for more than a single match?
 Conceptually the same as the SM algorithm, but needs to be generalized for multiple matches
 The hospitals, in this case, may have capacity greater than one
National Residency Matching Program
 1940s – Newly graduating MDs were being given earlier and earlier offers resulting in poor matches and/or exploding offers
 1950s – The National Residency Matching Program was created to match residents to hospitals in a hospitaloptimal stable allocation
 1998 – Matching updated to the studentoptimal RothPeranson algorithm that also permits couples to submit ranked pairs of position
 2003 – Alvin Roth published a summary of the NRMP in his paper The Origins, History, and Design of the Resident Match
 2012 – Nobel prize in Economics was given to Alvin Roth and Lloyd Shapley (David Gale had passed away in 2008).
HospitalResident Algorithm

Initialize all residents to be unmatched and hospitals to have an empty match list

HospitalOptimal
 Fill each hospital to capacity with mostpreferred residents
 Allow resident to upgrade for improved offers – forcing hospital to make nextbest offer

ResidentOptimal
 Match each resident to mostpreferred belowcapacity hospital
 Allow hospital to upgrade for improved offers – forcing resident to make nextbest offer
HospitalResident Wrapper
 The interface for both the hospitaloptimal and residentoptimal algorithms are the same and they both require the mapping from dictionaries to enumerated lists (and back again)
 Kdb+ 4.1 pattern matching is used to unpack the dictionary parameters
/ hospital resident (HR) problem wrapper function that enumerates the inputs,
/ calls the hr function and unenumerates the results
hrw:{[hrf;C;hn!hp;rn!rp]
hrHR:((count hn;0)#0N;count[rn]#0N;rn?hp;hn?rp);
hrHR:hrf[C hn] over hrHR;
hrHR:(hn;rn;hn;rn)!'(rn;hn;rn;hn)@'hrHR;
hrHR}
hrr:hrw[hrra] / hospital resident (residentoptimal)
hrh:hrw[hrha] / hospital resident (hospitaloptimal)
HospitalResident ResidentOptimal Implementation
 Kdb+ 4.1 pattern matching is used to unpack the list of
parameters used in the iteration as well as to assign the results
of
.matching.prune
 To find next available resident we limit our search to unmatched residents with viable preferences
 The
?
find operator is used again to find the first such resident  Drop student when over capacity and prune when at capacity
/ given hospital (c)apacity and (h)ospital matches, (r)esident matches,
/ (H)ospital and (R)esident preferences, find next residentoptimal match
hrra:{[c;(h;r;H;R)]
mi:?[;1b] 0<count each R w:where null r; / first unmatched with prefs
if[mi=count w;:(h;r;H;R)]; / nothing to match
hp:H hi:first R ri:w mi; / preferred hospital
if[not ri in hp;:(h;r;H;@[R;ri;1_])]; / hospital rejects
ch:count ris:h[hi],:ri; r[ri]:hi; / match
if[ch>c hi; / over capacity
wri:hp max hp?ris; / worst resident
ch:count ris:h[hi]:drop[ris;wri]; / drop resident from hospital match
r[wri]:0N; / drop resident match
];
if[ch=c hi;(R;H hi):prune[R;hp;hi;ris]]; / prune
(h;r;H;R)}
HospitalResident HospitalOptimal Implementation
 Kdb+ 4.1 pattern matching is used to unpack the list of
parameters used in the iteration as well as to assign the results
of
.matching.prune
 To find the next available hospital we ignore hospitals at capacity, then drop existing matches from hospital preferences
 The
?
find operator is used again to find the first such hospital  Prune on every match
/ given hospital (c)apacity and (h)ospital matches, (r)esident matches,
/ (H)ospital and (R)esident preferences, find next hospitaloptimal match
hrha:{[c;(h;r;H;R)]
w:where c>count each h; / limit to hospitals with capacity
mi:?[;1b] 0<count each m:H[w] except' h w; / first with unmatched prefs
if[mi=count w;:(h;r;H;R)]; / nothing to match
rp:R ri:first m mi; hi:w mi; / preferred resident
if[not hi in rp;:(h;r;@[H;hi;1_];R)]; / resident preferences
if[not null ehi:r ri; h:@[h;ehi;drop;ri]]; / drop existing match
h[hi],:ri; r[ri]:hi; / match
(H;R ri):prune[H;rp;ri;hi]; / prune
(h;r;H;R)}
HospitalResident Setup
 The Python matching package provides links to hospital capacity and hospital and resident preference data in YAML format
 Convert and store YAML files in JSON format
q)2#C:.j.k raze read0 `:capacities.json
Dewi Sant  30
Prince Charles 30
q)2#H:`$.j.k raze read0 `:hospitals.json
Dewi Sant  `093`067`136`177`060`196`197`184`156`075`092`034`111`174`171`064`022`..
Prince Charles `124`146`027`017`174`133`001`106`097`179`018`006`172`057`163`103`081`..
q)2#R:`$.j.k raze read0 `:residents.json
000 `Royal Glamorgan`Prince of Wales`Dewi Sant`Royal Gwent`Prince Charles
001 `Prince of Wales`Royal Gwent`Royal Glamorgan`University`Prince Charles`St. David
HospitalResident Execution
 Both approaches return a hospital > residents dictionary, resident > hospital dictionary as well as the pruned hospital and resident preference dictionaries
q)first hrHR:.matching.hrr[C;H;R]
Dewi Sant  `010`011`013`019`022`023`037`039`040`045`046`065`067`072`079`083`086..
Prince Charles  `007`008`009`026`027`031`034`041`044`051`059`061`069`070`087`107`110..
Prince of Wales `001`004`017`030`035`048`064`078`088`097`111`112`124`128`132`138`140..
Royal Glamorgan `000`014`015`016`018`021`024`029`033`042`053`058`073`075`076`089`096..
Royal Gwent  `002`006`028`036`054`068`071`090`091`105`120`121`141`145`155`161`163..
St. David  `005`012`020`032`043`049`056`060`063`077`084`085`092`093`094`099`101..
University  `038`047`050`052`055`057`062`074`080`082`098`100`102`103`109`122`148..
q)5#hrHR 1
000 Royal Glamorgan
001 Prince of Wales
002 Royal Gwent
003 University
004 Prince of Wales
StudentAllocation (SA) Problem
 Let’s relax the constraints once more and insert an intermediary between the suitor and reviewer
 Supervisors have projects
 Students rank projects
 Supervisors rank all students that have ranked their projects
StudentAllocation Algorithm

Initialize all students to be unmatched and supervisors and projects to have empty match lists

SupervisorOptimal
 Fill each project to capacity with mostpreferred students
 Allow student to upgrade for improved offers – forcing supervisor to make nextbest offer

StudentOptimal
 Match each student to mostpreferred belowcapacity project
 Allow supervisor to upgrade for improved offers – forcing student to make nextbest offer
StudentAllocation Wrapper
 The interface for both the supervisoroptimal and studentoptimal algorithms are, once again, the same and they both require the mapping from dictionaries to enumerated lists (and back again)
 Kdb+ 4.1 pattern matching is used to unpack the dictionary parameters
/ studentallocation (SA) problem wrapper function that enumerates the
/ inputs, calls the sa function and unenumerates the results
saw:{[saf;PC;UC;pn!pu;un!up;sn!sp]
pusUS:((count pn;0)#0N;(count un;0)#0N;count[sn]#0N;sn?up;pn?sp);
pusUS:saf[PC pn;UC un;un?pu] over pusUS;
pusUS:(pn;un;sn;un;sn)!'(sn;sn;pn;sn;pn)@'pusUS;
pusUS}
sas:saw[sasa] / studentallocation (studentoptimal)
sau:saw[saua] / studentallocation (supervisoroptimal)
StudentAllocation StudentOptimal Implementation
 Kdb+ 4.1 pattern matching is used to unpack the list of
parameters used in the iteration as well as to assign the last
result of
.matching.prune
 Limit search to unmatched students with viable preferences
 The
?
find operator is used again to find the first such student  Drop student when over capacity and prune when at capacity
/ given (p)roject (c)apacity, s(u)pervisor (c)apacity, (p)roject to
/ s(u)pervisor map and (p)roject matches, s(u)pervisor matches, (s)tudent
/ matches, s(U)pervisor preferences and (S)tudent preferences, find next
/ studentoptimal match
sasa:{[pc;uc;pu;(p;u;s;U;S)]
mi:?[;1b] 0<count each S w:where null s; / first unmatched student
if[mi=count w;:(p;u;s;U;S)]; / nothing to match
up:U ui:pu pi:first S si:w mi; / preferred project's supervisors preferences
cu:count usis:u[ui],:si;cp:count psis:p[pi],:si;s[si]:pi; / match
if[cp>pc pi; / project over capacity
wsi:up max up?psis; s[wsi]:0N; / worst student
cp:count psis:p[pi]:drop[psis;wsi]; / drop from project
cu:count usis:u[ui]:drop[usis;wsi]; / drop from supervisor
];
if[cu>uc ui; / supervisor over capacity
wsi:up max up?usis; / worst student
p:@[p;s wsi;drop;wsi]; s[wsi]:0N; / drop from other project
cu:count usis:u[ui]:drop[usis;wsi]; / drop from supervisor
];
if[cp=pc pi;(S;):prune[S;up;pi;psis]]; / prune
if[cu=uc ui;(S;U ui):prune[S;up;where pu=ui;usis]];
(p;u;s;U;S)}
StudentAllocation SupervisorOptimal Implementation
 Kdb+ 4.1 pattern matching is used to unpack the list of
parameters used in the iteration as well as to assign the last
result of
.matching.prune
 The
.matching.nextusp
function is used to find the next available supervisor, student and project to match  Iterate until either a match is found or no matches are available
 Iteration passes the supervisor index and increments each time
/ given (p)roject (c)apacity, s(u)pervisor (c)apacity, (p)roject to
/ s(u)pervisor map and (p)roject matches, s(u)pervisor matches, (s)tudent
/ matches, s(U)pervisor preferences and (S)tudent preferences, find next
/ supervisoroptimal match
saua:{[pc;uc;pu;(p;u;s;U;S)]
ubc:uc>count each u; / supervisors below capacity
pbc:pc>count each p; / projects below capacity
usp:(1=count::) nextusp[pbc;ubc;pu;p;S;U]/ 0; / iterate across supervisors
if[not count usp;:(p;u;s;U;S)]; / no further matches found
(ui;si;pi):usp; / unpack
if[not null epi:s si; u:@[u;pu epi;drop;si]; p:@[p;epi;drop;si]]; / drop
u[ui],:si; p[pi],:si; s[si]:pi; / match
(;S si):prune[U;S si;();pi]; / prune
(p;u;s;U;S)}
StudentAllocation Supervisor Search
 Finding the next supervisor’s favorite student’s favorite supervisor’s project is the slowest function
 The python implementation has a triplenested
for
loop and breaks out immediately upon success
/ given (p)rojects (b)elow (c)apacity boolean vector, s(u)pervisors (b)elow
/ (c)apacity vector, (p)roject to s(u)pervisor map, (p)roject matches,
/ (S)tudent preferences, s(U)pervisor preferences and a single s(u)pervisor
/ (i)ndex, return the s(u)pervisor's preferred (s)tudent and their preferred
/ (p)roject (that is mapped to the supervisor) as a triplet (u;s;p). if no
/ match is found, return the next supervisor index ui. return an empty list
/ if all supervisors have been exhausted.
nextusp:{[pbc;ubc;pu;p;S;U;ui]
if[ui=count U;:()]; / no more supervisors
if[not ubc ui;:ui+1]; / supervisor at capacity
pis:S sis:U ui; / unpack students and their projects
pis:pis@'where each (pbc&ui=pu) pis; / supervisor's projects with capacity
pis:pis@'where each not sis (in/:)' p pis; / not already matched
if[not count sp:raze sis (,/:)' pis;:ui+1]; / (student;project)
usp:ui,first sp; / (supervisor;student;project)
usp}
StudentAllocation Setup
 The Python matching package provides links to student, project and supervisor capacity and preference data in CSV format
q)2#s:2!("JJ",(2+count first x)#"S";1#",") 0: x:read0 `:students.csv
name rank 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 ..
 ..
190000 3  G2 P1 V2 S0 A0 O0 L0 D2 K1 V1 R2 G2 Y2 G2 W0 K0 X0 O1 ..
190001 56  Q0 P1 P0 M1 N0 P1 T2 N1 I1 K0 P3 X1 F0 P1 S0 C0 Z0 L0 H2 ..
q)2#p:("SJS";1#",") 0: `:projects.csv
code capacity supervisor

A0 2 A
A1 3 A
q)2#u:("SJ";1#",") 0: `:supervisors.csv
name capacity

A 3
B 1
StudentAllocation Execution
 Both approaches return supervisor > student, project > student and student > project dictionaries as well as the pruned supervisor and student preference dictionaries
q)d:preprocess[u;p;s]
q)5#first pusUS:.matching.sas . d`PC`UC`PU`U`S
A0 `long$()
A1 190019 190034
A2 ,190017
B0 `long$()
B1 ,190091
q)5#pusUS 1
A 190019 190034 190017
B ,190091
C 190003 190062 190068 190079 190070
D 190008 190009 190015 190039 190056
E 190022 190063
q)5#pusUS 2
190000 G2
190001 Q0
190002 U0
190003 C0
190004 I0
Performance
The key to performance is elegance, not battalions of special cases – Jon Bentley and Doug McIlroy
Code Profiling
 When the whole implementation is ~15 lines of code, we need a lineprofiler

Leslie Goldsmith created the
qprof
lineprofiler in 2015q)\l prof.q q).prof.prof `.matching q)\ts:100 eSR:.matching.sm[G;B] 82 8672 q).prof.report` Name Line Stmt Count Total Own Pct  .matching.prune 3 S:S@[;sis;drop;]/ris; 1400 00:00.024 00:00.018 22.97% .matching.sma 9 (eSR Si;eSR[Ri;ri]):prune[eSR 1600 00:00.043 00:00.009 11.83% .matching.drop 0 x _ x?y 5200 00:00.005 00:00.005 6.28% .matching.sm 2 eSR:sma over eSR; 100 00:00.081 00:00.005 6.65% .matching.sma 2 mi:?[;1b]0<count each S w:wher 1700 00:00.005 00:00.005 6.61% .matching.prune 2 (rp;sis):(0;i)cut rp; 1400 00:00.004 00:00.004 5.61% .matching.sma 3 if[mi=count w;:eSR]; 1700 00:00.004 00:00.004 4.89% .matching.sma 6 if[not count[e]=ei:e?ri;if[sir 1600 00:00.004 00:00.004 5.47% .matching.sma 8 e[si]:ri;eSR[0]:e; 1600 00:00.004 00:00.004 5.41% .matching.prune 4 (S;rp) 1400 00:00.003 00:00.003 4.03% .matching.sma 4 rp:R ri:first s:S si:w mi; 1600 00:00.003 00:00.003 4.68% .matching.sma 5 if[count[rp]=sir:rp?si;:.[eSR; 1600 00:00.003 00:00.003 4.73% .matching.sma 10 eSR 1600 00:00.003 00:00.003 4.49% .matching.prune 0 if[count[rp]<=i:1+max rp?sis;: 1600 00:00.002 00:00.002 2.64% .matching.sma 0 e:eSR 0;S:eSR Si:1;R:eSR Ri:1 1700 00:00.002 00:00.002 2.49% .matching.sm 0 eSR:(count[sn]#0N;rn?sp;sn?rp) 100 00:00.000 00:00.000 0.37% .matching.sm 3 eSR:(sn;sn;rn)!'(rn;rn;sn)@'eS 100 00:00.000 00:00.000 0.49% .matching.sm 4 eSR 100 00:00.000 00:00.000 0.27%
Timing Setup
 Using PyKX we can access both implementations from python

Need to increase recursion limit due to
copy.deepcopy
callsys.setrecursionlimit(10000) # overcome call to copy.deepcopy
Timing Result
 Using the
timeit
package we can compare the execution times 
Python implementation is ~10x slower than q (and worsens with increased dimensions)
>>> assert smq(sd, rd) == smp(sd, rd) # assert equality >>> timeit.timeit('smq(sd,rd)', number=1, globals=globals()) 0.05793206300000975 >>> timeit.timeit('smp(sd,rd)', number=1, globals=globals()) 0.5015301169999589
Q Enhancements
Nothing happens unless first we dream – Carl Sandburg
Well, the the J mentality is that only J is needed and they’re right – Marshall Lochbaum “Naming is Hard” The Array Cast
PyKX Type Handling (SOLVED)!

The python lists (not just np arrays) and dictionary keys are promoted to uniform Kdb+ vectors
>>> kx.q("0N!",{1:[1,2],2:[2,3]}) 1 2!(1 2;2 3) pykx.Dictionary(pykx.q(' 1 1 2 2 2 3 '))
Pattern Matching (SOLVED)!

Using
over
andscan
requires packing and unpacking complex state for each iterationp:pusUS 0;u:pusUS 1;s:pusUS 2;U:pusUS 3;S:pusUS 4;

Pattern matching makes this much more elegant
q)(p;u;s;U;S):pusUS;
Native assert
(SOLVED)!
 Unit tests require an
assert
function that throws an exception when the assertion fails  We can use Kdb+ 4.1 pattern matching as a lightweight assert
q)(1b):{[a:1;b:2]} ~ {[a:1;b:1]}
'match
[0] (1b):{[a:1;b:2]} ~ {[a:1;b:1]}
^
YAML Support (NOT SOLVED)
 The hospitalresident problem capacities, hospitals and residents inputs are stored in YAML files
KDB Insights
configuration is stored in YAML files YAML supports comments
 YAML supports more types than JSON (booleans, dates, timestamps, null)
q).y.k "\n" sv read0 `:hospitals.yml
Summary
 Which of these algorithms matches my intern problem?
 Did I get assigned more appropriate interns?
 Deferred acceptance algorithms appear in real life
 Vector implementations are faster than objectoriented ones
 The algorithms are heavily reliant on the
?
find operator  Kdb+ 4.1 pattern matching is used to unpack dictionary and list function arguments as well as the results from functions
 The
q
matching
library can be found on github: https://github.com/psaris/matching/releases/tag/hkmeetup  This presentation can be found at https://nick.psaris.com