Table of Contents

Meetup Details

2024-02-28 Hong Kong Meetup

Motivation

  • Intern-desk pairing
  • Two distinct populations
  • Interns rank desks
  • Desks rank interns
  • Prevent back-room deals
  • Is there a single optimal pairing?

Simple Example

q)show S:([A:`D`E`F;B:`D`F`E;C:`F`D`E]) / suitor preferences
A| D E F
B| D F E
C| F D E
q)show R:([D:`B`C`A;E:`A`C`B;F:`C`B`A]) / reviewer preferences
D| B C A
E| A C B
F| C B A
q)show E:([A:`E;B:`D;C:`F])             / engagements
A| E
B| D
C| F
  • Confirm algorithm produces optimal engagements
q)E~first .matching.sm[S;R]
1b

Dictionary Literal Digression

  • Intuitively extends the table and list syntax

    q)([a:1;b:2])
    a| 1
    b| 2
    
  • Keys are always symbols and there is no zero-column table

    q)0N!([])
    (`symbol$())!()
    
  • There is now an elegant way to create a single element dictionary!

    q)enlist[`k]!enlist`v
    k| v
    q)([k:`v])
    k| v
    

Engagement Demonstration

  • Two sets of index cards
  • Qgods: 0-9
  • Qbies: A-J
  • Pair-off so everyone is ‘happy’
  • How did we do?

Qgods and Qbies

q)show g:.Q.n                     / define qgods
"0123456789"
q)show b:10#.Q.A                  / define qbies
"ABCDEFGHIJ"

Qgod and Qbie Preferences

q)\S -314159i                   / reset seed
q)show G:g!-10?/:10#enlist b    / define qgod prefs
0| "ICEBGAFDHJ"
1| "GHDCJBAIEF"
2| "HBDJFAEGIC"
3| "BIFHADGECJ"
4| "EDICJFHABG"
5| "BEHGACFDIJ"
6| "AGCIHEJBDF"
7| "GBHDEIFACJ"
8| "JGEBHADICF"
9| "FIGJBACHED"
q)show B:b!-10?/:10#enlist g    / define qbie prefs
A| "9378126054"
B| "8014967352"
C| "7284539016"
D| "3217965084"
E| "7051684329"
F| "6943720851"
G| "4501672893"
H| "7681405392"
I| "7458106239"
J| "9154860327"

Qgod-Optimal Matches

  • The suitor obtains the best possible match among all stable matches
q)geSR:.matching.sm[G;B]          / perform qgod-optimal match
q)geSR 0                          / engagements
0| C
1| G
2| D
3| H
4| I
5| E
6| A
7| B
8| J
9| F
q)geSR 1                          / remaining qgod prefs
0| "CEBGH"
1| "GHJBA"
2| "DAC"
3| "HADC"
4| "ICJHBG"
5| "EHGCJ"
6| "AHBF"
7| "BHEIAC"
8| "JBHAC"
9| "FJBAC"

Qbie Matches

  • The reviewer obtains the worst possible match among all stable matches
q)geSR 0                          / engagements
0| C
1| G
2| D
3| H
4| I
5| E
6| A
7| B
8| J
9| F
q)geSR 2                          / remaining qbie prefs
A| "9378126"
B| "8014967"
C| "72845390"
D| "32"
E| "705"
F| "69"
G| "4501"
H| "76814053"
I| "74"
J| "91548"

Qbie-Optimal Matches

  • Allowing Qbies to propose first improves their matches
  • Matches are still stable
q)beSR:.matching.sm[B;G]          / perform qbie-optimal match
q)beSR 0                          / engagements
A| 3
B| 8
C| 0
D| 2
E| 5
F| 6
G| 1
H| 7
I| 4
J| 9

Uniqueness

  • The Qgod-optimal and Qbie-optimal engagements are not always the same
  • Use Kdb+ 4.1 pattern matching to invert the Qbie dictionary
q)geSR[0],'geSR[0] = {[k!v]v!k} beSR 0
0| "C" 1b
1| "G" 1b
2| "D" 1b
3| "H" 0b
4| "I" 1b
5| "E" 1b
6| "A" 0b
7| "B" 0b
8| "J" 0b
9| "F" 0b

Roommate Demonstration

  • One set of index cards
  • Qroomies: A-J
  • Pair-off so everyone is ‘happy’
  • How did we do?

Qroomie Preferences

  • Roommates rank everyone else but themselves
q)show R:b!value[G] except' b     / define qroomie prefs
A| "ICEBGFDHJ"
B| "GHDCJAIEF"
C| "HBDJFAEGI"
D| "BIFHAGECJ"
E| "DICJFHABG"
F| "BEHGACDIJ"
G| "ACIHEJBDF"
H| "GBDEIFACJ"
I| "JGEBHADCF"
J| "FIGBACHED"

Qroomie Matches

  • The first stage of the algorithm can produce multiple solutions
  • The second stage of the algorithm removes ‘cycles’
q)show reR:.matching.sr R
"ABCDEFGHIJ"!"GHFEDCABJI"
"ABCDEFGHIJ"!("CG";"HD";"FA";"BE";"DF";"EC";"AH";"GB";,"J";,"I")
"ABCDEFGHIJ"!(,"G";,"H";,"F";,"E";,"D";,"C";,"A";,"B";,"J";,"I")
q)reR 1
A| "CG"
B| "HD"
C| "FA"
D| "BE"
E| "DF"
F| "EC"
G| "AH"
H| "GB"
I| ,"J"
J| ,"I"

Stable Marriage (SM) Problem

Given two distinct populations how do you create matches such that no pair prefers each other over their current matching?

Stable Marriage Algorithm

The Gale-Shapley 1962 (Deferred Acceptance) algorithm:

  • All participants rank partners
  • Iteratively engage each suitor:
    • Return early if every suitor is engaged
    • Find preferred reviewer for next single suitor
    • If reviewer is single, they accepted suitor
    • Else, allow reviewer to renege and upgrade – old suitor gets to try again
    • Return updated engagement vector, and suitor and reviewer preference vectors

Stable Marriage Theorems

  • The algorithm completes in a finite number of steps
  • The algorithm terminates in at most \(n^2 - n + 1\) iterations
  • The algorithm always produces stable matches
  • The matches are always suitor-optimal (and reviewer-pessimal)
  • The matches are unique if suitor-optimal and reviewer-optimal results are identical

Enumerating Preference Maps

  • Humans prefer names
  • Algorithms prefer indices
  • We convert ranking dictionaries to 0-based lists by enumerating each value with the ? find operator
q)G                               / qgod prefs
0| "ICEBGAFDHJ"
1| "GHDCJBAIEF"
2| "HBDJFAEGIC"
3| "BIFHADGECJ"
4| "EDICJFHABG"
5| "BEHGACFDIJ"
6| "AGCIHEJBDF"
7| "GBHDEIFACJ"
8| "JGEBHADICF"
9| "FIGJBACHED"
q)key B                           / qbie enumeration vector
"ABCDEFGHIJ"
q)show S:key[B]?value G           / qgod enumerations
8 2 4 1 6 0 5 3 7 9
6 7 3 2 9 1 0 8 4 5
7 1 3 9 5 0 4 6 8 2
1 8 5 7 0 3 6 4 2 9
4 3 8 2 9 5 7 0 1 6
1 4 7 6 0 2 5 3 8 9
0 6 2 8 7 4 9 1 3 5
6 1 7 3 4 8 5 0 2 9
9 6 4 1 7 0 3 8 2 5
5 8 6 9 1 0 2 7 4 3
q)R:key[G]?value B                / gbie enumerations

Stable Marriage Wrapper

  • Use Kdb+ 4.1 pattern matching to unpack dictionary parameters
  • Enumerate the suitor and reviewer dictionaries
  • Build all-null engagement vector
  • Iterate with .matching.sma until convergence
  • Convert enumerations back to dictionaries
/ given (S)uitor and (R)eviewer preferences, return the (e)ngagement
/ dictionary and remaining (S)uitor and (R)eviewer preferences for inspection
sm:{[sn!sp;rn!rp]
 eSR:(count[sn]#0N;rn?sp;sn?rp);  / initial state/enumerated values
 eSR:sma over eSR;                / iteratively apply Gale-Shapley algorithm
 eSR:(sn;sn;rn)!'(rn;rn;sn)@'eSR; / map enumerations back to original values
 eSR}

Stable Marriage Implementation

  • Manually unpack arguments because SR may be a single matrix or a pair of matrices
  • The suitor and reviewer indices – Si and Ri respectively – are defined as variables so that the same function can be used for the Stable Roommates (SR) Problem
  • Use Kdb+ 4.1 pattern matching to unpack .matching.prune results
/ given (e)ngagement vector and (S)uitor and (R)eviewer preferences, find
/ next engagement, remove undesirable suitors and unavailable reviewers.
/ roommate preferences are assumed if (R)eviewer preferences are not
/ provided.
sma:{[eSR]
 e:eSR 0;S:eSR Si:1;R:eSR Ri:-1+count eSR;   / manually unpack
 mi:?[;1b] 0<count each S w:where null e;    / first unmatched with prefs
 if[mi=count w;:eSR];                        / no unmatched suitor
 rp:R ri:first s:S si:w mi;                  / preferred reviewer's prefs
 if[count[rp]=sir:rp?si;:.[eSR;(Si;si);1_]]; / not on reviewer's list
 / renege if already engaged and this suitor is better
 if[not count[e]=ei:e?ri;if[sir<rp?ei;eSR:.[eSR;(Si;ei);1_];e[ei]:0N]];
 e[si]:ri; eSR[0]:e;                         / get engaged
 (eSR Si;eSR[Ri;ri]):prune[eSR Si;rp;ri;si]; / assignment order matters
 eSR}

Pruning Preference Vectors

  • Once a suitor and reviewer are engaged, we can make two optimizations:
    1. Remove all reviewer preferences that are worse than the current suitor
    2. Remove the reviewer from all worse suitors’ preferences
  • Kdb+ 4.1 pattern matching is used yet again to unpack the results of cut
/ given (S)uiter preferences, (r)eviewer (p)refs, and (s)uitor (i)ndice(s)
/ and (r)eviewer (i)ndice(s), return the pruned reviewer and Suitor prefs
prune:{[S;rp;ris;sis]
 if[count[rp]<=i:1+max rp?sis;:(S;rp)]; / return early if nothing to do
 (rp;sis):(0;i) cut rp;                 / drop worse suitors from preferences
 S:S @[;sis;drop;]/ ris;                / drop reviewers from worse suitors
 (S;rp)}

Pruning Example

  • Assume suitor 0 proposes to reviewer 4
  • All suitors past 0 are removed from reviewer prefs
  • Reviewer 4 is removed from the suitors that were cut
q)Srp:.matching.prune[S;rp:R ri;ri:4;si:0]; / prune
q)show rp                                   / initial reviewer prefs
7 0 5 1 6 8 4 3 2 9
q)show last Srp                             / everything past 0 is cut
7 0
q)show first Srp                  / 4 is dropped from cut reviewers
8 2 4 1 6 0 5 3 7 9
6 7 3 2 9 1 0 8 5
7 1 3 9 5 0 6 8 2
1 8 5 7 0 3 6 2 9
3 8 2 9 5 7 0 1 6
1 7 6 0 2 5 3 8 9
0 6 2 8 7 9 1 3 5
6 1 7 3 4 8 5 0 2 9
9 6 1 7 0 3 8 2 5
5 8 6 9 1 0 2 7 3

Pruning Logistics

.matching.prune handles lists of suitors and reviewers

/ given (S)uiter preferences, (r)eviewer (p)refs, and (s)uitor (i)ndice(s)
/ and (r)eviewer (i)ndice(s), return the pruned reviewer and Suitor prefs
prune:{[S;rp;ris;sis]
 if[count[rp]<=i:1+max rp?sis;:(S;rp)]; / return early if nothing to do
 (rp;sis):(0;i) cut rp;                 / drop worse suitors from preferences
 S:S @[;sis;drop;]/ ris;                / drop reviewers from worse suitors
 (S;rp)}

Stable Marriage Execution

  • The implementation returns the engagements as well as the remaining unpruned suitor and reviewer preferences as dictionaries
  • Strictly speaking, we only need to return the engagement dictionary, but having access to the remaining preferences adds intuition
q).matching.sm[B;G]
"ABCDEFGHIJ"!"3802561749"
"ABCDEFGHIJ"!("36";"867352";"06";"264";"5684";"693";"16789";"7632";"40639";"986")
"0123456789"!("IC";,"G";"HBD";"BIFHA";"EDI";"BE";"AGCIHEJBDF";"GBH";"JGEB";"FIGJ")

Stable Marriage Vector Observations

  • The ? find operator is used in 5 different ways:
    1. Enumerate dictionary values
    2. Search engagement vector for the next single suitor
    3. Search engagement vector to see if reviewer is engaged
    4. Compare ranking between suitor and existing suitor
    5. Search reviewer preferences when pruning worse suitors
  • The engagement vector remains the same length across iterations, but the preference vectors shrink as suitors are pruned
  • Each iteration needs to unpack the single argument into distinct variables and then pack them back up for the next iteration

Strategy

  • Suitors can not improve their results by changing their rankings
  • Reviewers can (sometimes) improve their results by truncating their rankings – but risk not getting matched at all
  • Reviewer “H” originally gets 8th suitor on their list
  • By not permitting this matching, they (and “A” as well) improve their match
q)1+(,'/)(B?'{[k!v]v!k} first .matching.sm[G]::) each 1 @[;"H";7#]\ B
A| 7 2
B| 7 7
C| 8 8
D| 2 2
E| 3 3
F| 2 2
G| 4 4
H| 8 2
I| 2 2
J| 5 5

Stable Roommates (SR) Problem

  • What if we only had a single population?
  • Each participant is required to rank every other participant
  • It is possible that no stable solution exists

Stable Roommates Algorithm

  • Robert W. Irving published a 2-phase solution in 1985
  • Phase 1 passes the roommate preferences to the Gale-Shapley algorithm as both the suitor and reviewer
  • Since q does not allow passing by pointer, the .matching.sma function was conditioned on how many preference lists were passed
  • Phase 2 removes ‘cycles’ which are rotations that produce equally stable solutions

Stable Roommates Wrapper

  • Kdb+ 4.1 pattern matching is used to unpack the dictionary parameter
  • The preferences are enumerated
  • The results are unenumerated before being passed back as a dictionary
/ given (R)oomate preference dictionary, return the (a)ssignment dictionary
/ and (R)oommate preference dictionaries from each decycle stage
sr:{[rn!rp]
 aR:(count[rn]#0N;rn?rp);       / initial assignment/enumerated values
 aR:sra aR;                     / apply stable roommate (SR) algorithm
 aR:rn!/:rn aR;                 / map enumerations back to original values
 aR}

Stable Roommates Algorithm

  • Phase 1 applies the stable marriage (Gale-Shapley) algorithm
  • Kdb+ 4.1 pattern matching is used to unpack the list and throw away the first element
  • The results of phase 1 are then passed to .matching.decycle to remove unstable cycles
  • A final assignment vector is prepended to the intermediate ‘decycle’ states before being returned
/ given (a)ssignment vector and (R)oomate preferences, return the completed
/ (a)ssignment vector (R)oommate preferences from each decycle stage
sra:{[aR]
 (;R):sma over aR;              / apply phase 1 and throw away assignments
 R:decycle scan R;              / apply phase 2
 aR:enlist[last[R][;0]],R;      / prepend assignment vector
 aR}

Decycling Roommate Assignments

  • The algorithm has no solution if any participant goes unmatched
  • The algorithm terminates when all participants are uniquely matched
  • Cycles are discovered with the .matching.cycle function and removed with the .matching.pruner roommate prune function – neither of which will be discussed
/ phase 2 of the stable roommates (SR) problem removes all cycles within the
/ remaining candidates leaving the one true stable solution
decycle:{[R]
 if[any 0=c:count each R;'`unstable]; / unable to match a roommate
 if[count[c]=i:?[;1b] c>1;:R];        / first roommate with multiple prefs
 c:cycle[R] enlist (i;R[i;0]);        / build the cycle starting here
 R:pruner/[R;c[;1];-1 rotate c[;0]];  / prune prefs based on dropped cycle
 R}

Stable Roommates Setup

  • A worked example (including decycling) can be found on the Stable Roommates Problem Wikipedia page
  • Each participant ranks all other participants
  • Even though this example uses integers, the algorithm requires 0-index enumerations so we create a dictionary and supply it to the algorithm wrapper
q)show R:(1+til count R)!R:get each read0 `wmate.txt
1| 3 4 2 6 5
2| 6 5 4 1 3
3| 2 4 5 1 6
4| 5 2 3 6 1
5| 3 1 2 4 6
6| 5 1 3 4 2

Stable Roommates Execution

  • The .matching.sr function produces:
    • the assignment dictionary
    • the results of the Gale-Shapley algorithm
    • each step of the decycling process
  • Notice how the assignment dictionary is symmetric – 1 is assigned 6 and 6 is assigned 1
q).matching.sr R
1 2 3 4 5 6!6 4 5 2 3 1
1 2 3 4 5 6!(4 2 6;6 5 4 1 3;2 4 5;5 2 3 6 1;3 2 4;1 4 2)
1 2 3 4 5 6!(2 6;6 5 4 1;4 5;5 2 3;3 2 4;1 2)
1 2 3 4 5 6!(,6;5 4;4 5;2 3;3 2;,1)
1 2 3 4 5 6!(,6;,4;,5;,2;,3;,1)

Stable Roommates Vector Observations

  • The ? find operator is used two more times:
    1. Search roommate preference counts for decycle opportunities
    2. Search chain for ‘tail’ location so the non-repeating section can be excluded from the cycle

Hospital-Resident (HR) Problem

  • What if there was capacity for more than a single match?
  • Conceptually the same as the SM algorithm, but needs to be generalized for multiple matches
  • The hospitals, in this case, may have capacity greater than one

National Residency Matching Program

  • 1940s – Newly graduating MDs were being given earlier and earlier offers resulting in poor matches and/or exploding offers
  • 1950s – The National Residency Matching Program was created to match residents to hospitals in a hospital-optimal stable allocation
  • 1998 – Matching updated to the student-optimal Roth-Peranson algorithm that also permits couples to submit ranked pairs of position
  • 2003 – Alvin Roth published a summary of the NRMP in his paper The Origins, History, and Design of the Resident Match
  • 2012 – Nobel prize in Economics was given to Alvin Roth and Lloyd Shapley (David Gale had passed away in 2008).

Hospital-Resident Algorithm

  • Initialize all residents to be unmatched and hospitals to have an empty match list

  • Hospital-Optimal

    • Fill each hospital to capacity with most-preferred residents
    • Allow resident to upgrade for improved offers – forcing hospital to make next-best offer
  • Resident-Optimal

    • Match each resident to most-preferred below-capacity hospital
    • Allow hospital to upgrade for improved offers – forcing resident to make next-best offer

Hospital-Resident Wrapper

  • The interface for both the hospital-optimal and resident-optimal algorithms are the same and they both require the mapping from dictionaries to enumerated lists (and back again)
  • Kdb+ 4.1 pattern matching is used to unpack the dictionary parameters
/ hospital resident (HR) problem wrapper function that enumerates the inputs,
/ calls the hr function and unenumerates the results
hrw:{[hrf;C;hn!hp;rn!rp]
 hrHR:((count hn;0)#0N;count[rn]#0N;rn?hp;hn?rp);
 hrHR:hrf[C hn] over hrHR;
 hrHR:(hn;rn;hn;rn)!'(rn;hn;rn;hn)@'hrHR;
 hrHR}

hrr:hrw[hrra]                  / hospital resident (resident-optimal)
hrh:hrw[hrha]                  / hospital resident (hospital-optimal)

Hospital-Resident Resident-Optimal Implementation

  • Kdb+ 4.1 pattern matching is used to unpack the list of parameters used in the iteration as well as to assign the results of .matching.prune
  • To find next available resident we limit our search to unmatched residents with viable preferences
  • The ? find operator is used again to find the first such resident
  • Drop student when over capacity and prune when at capacity
/ given hospital (c)apacity and (h)ospital matches, (r)esident matches,
/ (H)ospital and (R)esident preferences, find next resident-optimal match
hrra:{[c;(h;r;H;R)]
 mi:?[;1b] 0<count each R w:where null r; / first unmatched with prefs
 if[mi=count w;:(h;r;H;R)];               / nothing to match
 hp:H hi:first R ri:w mi;                 / preferred hospital
 if[not ri in hp;:(h;r;H;@[R;ri;1_])];    / hospital rejects
 ch:count ris:h[hi],:ri; r[ri]:hi;        / match
 if[ch>c hi;                              / over capacity
  wri:hp max hp?ris;                      / worst resident
  ch:count ris:h[hi]:drop[ris;wri]; / drop resident from hospital match
  r[wri]:0N;                        / drop resident match
  ];
 if[ch=c hi;(R;H hi):prune[R;hp;hi;ris]]; / prune
 (h;r;H;R)}

Hospital-Resident Hospital-Optimal Implementation

  • Kdb+ 4.1 pattern matching is used to unpack the list of parameters used in the iteration as well as to assign the results of .matching.prune
  • To find the next available hospital we ignore hospitals at capacity, then drop existing matches from hospital preferences
  • The ? find operator is used again to find the first such hospital
  • Prune on every match
/ given hospital (c)apacity and (h)ospital matches, (r)esident matches,
/ (H)ospital and (R)esident preferences, find next hospital-optimal match
hrha:{[c;(h;r;H;R)]
 w:where c>count each h;        / limit to hospitals with capacity
 mi:?[;1b] 0<count each m:H[w] except' h w; / first with unmatched prefs
 if[mi=count w;:(h;r;H;R)];                 / nothing to match
 rp:R ri:first m mi; hi:w mi;               / preferred resident
 if[not hi in rp;:(h;r;@[H;hi;1_];R)];      / resident preferences
 if[not null ehi:r ri; h:@[h;ehi;drop;ri]]; / drop existing match
 h[hi],:ri; r[ri]:hi;                       / match
 (H;R ri):prune[H;rp;ri;hi];                / prune
 (h;r;H;R)}

Hospital-Resident Setup

q)2#C:.j.k raze read0 `:capacities.json
Dewi Sant     | 30
Prince Charles| 30
q)2#H:`$.j.k raze read0 `:hospitals.json
Dewi Sant     | `093`067`136`177`060`196`197`184`156`075`092`034`111`174`171`064`022`..
Prince Charles| `124`146`027`017`174`133`001`106`097`179`018`006`172`057`163`103`081`..
q)2#R:`$.j.k raze read0 `:residents.json
000| `Royal Glamorgan`Prince of Wales`Dewi Sant`Royal Gwent`Prince Charles
001| `Prince of Wales`Royal Gwent`Royal Glamorgan`University`Prince Charles`St. David

Hospital-Resident Execution

  • Both approaches return a hospital -> residents dictionary, resident -> hospital dictionary as well as the pruned hospital and resident preference dictionaries
q)first hrHR:.matching.hrr[C;H;R]
Dewi Sant      | `010`011`013`019`022`023`037`039`040`045`046`065`067`072`079`083`086..
Prince Charles | `007`008`009`026`027`031`034`041`044`051`059`061`069`070`087`107`110..
Prince of Wales| `001`004`017`030`035`048`064`078`088`097`111`112`124`128`132`138`140..
Royal Glamorgan| `000`014`015`016`018`021`024`029`033`042`053`058`073`075`076`089`096..
Royal Gwent    | `002`006`028`036`054`068`071`090`091`105`120`121`141`145`155`161`163..
St. David      | `005`012`020`032`043`049`056`060`063`077`084`085`092`093`094`099`101..
University     | `038`047`050`052`055`057`062`074`080`082`098`100`102`103`109`122`148..
q)5#hrHR 1
000| Royal Glamorgan
001| Prince of Wales
002| Royal Gwent
003| University
004| Prince of Wales

Student-Allocation (SA) Problem

  • Let’s relax the constraints once more and insert an intermediary between the suitor and reviewer
  • Supervisors have projects
  • Students rank projects
  • Supervisors rank all students that have ranked their projects

Student-Allocation Algorithm

  • Initialize all students to be unmatched and supervisors and projects to have empty match lists

  • Supervisor-Optimal

    • Fill each project to capacity with most-preferred students
    • Allow student to upgrade for improved offers – forcing supervisor to make next-best offer
  • Student-Optimal

    • Match each student to most-preferred below-capacity project
    • Allow supervisor to upgrade for improved offers – forcing student to make next-best offer

Student-Allocation Wrapper

  • The interface for both the supervisor-optimal and student-optimal algorithms are, once again, the same and they both require the mapping from dictionaries to enumerated lists (and back again)
  • Kdb+ 4.1 pattern matching is used to unpack the dictionary parameters
/ student-allocation (SA) problem wrapper function that enumerates the
/ inputs, calls the sa function and unenumerates the results
saw:{[saf;PC;UC;pn!pu;un!up;sn!sp]
 pusUS:((count pn;0)#0N;(count un;0)#0N;count[sn]#0N;sn?up;pn?sp);
 pusUS:saf[PC pn;UC un;un?pu] over pusUS;
 pusUS:(pn;un;sn;un;sn)!'(sn;sn;pn;sn;pn)@'pusUS;
 pusUS}

sas:saw[sasa]                   / student-allocation (student-optimal)
sau:saw[saua]                   / student-allocation (supervisor-optimal)

Student-Allocation Student-Optimal Implementation

  • Kdb+ 4.1 pattern matching is used to unpack the list of parameters used in the iteration as well as to assign the last result of .matching.prune
  • Limit search to unmatched students with viable preferences
  • The ? find operator is used again to find the first such student
  • Drop student when over capacity and prune when at capacity
/ given (p)roject (c)apacity, s(u)pervisor (c)apacity, (p)roject to
/ s(u)pervisor map and (p)roject matches, s(u)pervisor matches, (s)tudent
/ matches, s(U)pervisor preferences and (S)tudent preferences, find next
/ student-optimal match
sasa:{[pc;uc;pu;(p;u;s;U;S)]
 mi:?[;1b] 0<count each S w:where null s; / first unmatched student
 if[mi=count w;:(p;u;s;U;S)];             / nothing to match
 up:U ui:pu pi:first S si:w mi; / preferred project's supervisors preferences
 cu:count usis:u[ui],:si;cp:count psis:p[pi],:si;s[si]:pi; / match
 if[cp>pc pi;                         / project over capacity
  wsi:up max up?psis; s[wsi]:0N;      / worst student
  cp:count psis:p[pi]:drop[psis;wsi]; / drop from project
  cu:count usis:u[ui]:drop[usis;wsi]; / drop from supervisor
  ];
 if[cu>uc ui;                         / supervisor over capacity
  wsi:up max up?usis;                 / worst student
  p:@[p;s wsi;drop;wsi]; s[wsi]:0N;   / drop from other project
  cu:count usis:u[ui]:drop[usis;wsi]; / drop from supervisor
  ];
 if[cp=pc pi;(S;):prune[S;up;pi;psis]]; / prune
 if[cu=uc ui;(S;U ui):prune[S;up;where pu=ui;usis]];
 (p;u;s;U;S)}

Student-Allocation Supervisor-Optimal Implementation

  • Kdb+ 4.1 pattern matching is used to unpack the list of parameters used in the iteration as well as to assign the last result of .matching.prune
  • The .matching.nextusp function is used to find the next available supervisor, student and project to match
  • Iterate until either a match is found or no matches are available
  • Iteration passes the supervisor index and increments each time
/ given (p)roject (c)apacity, s(u)pervisor (c)apacity, (p)roject to
/ s(u)pervisor map and (p)roject matches, s(u)pervisor matches, (s)tudent
/ matches, s(U)pervisor preferences and (S)tudent preferences, find next
/ supervisor-optimal match
saua:{[pc;uc;pu;(p;u;s;U;S)]
 ubc:uc>count each u;                          / supervisors below capacity
 pbc:pc>count each p;                          / projects below capacity
 usp:(1=count::) nextusp[pbc;ubc;pu;p;S;U]/ 0; / iterate across supervisors
 if[not count usp;:(p;u;s;U;S)];               / no further matches found
 (ui;si;pi):usp;                               / unpack
 if[not null epi:s si; u:@[u;pu epi;drop;si]; p:@[p;epi;drop;si]]; / drop
 u[ui],:si; p[pi],:si; s[si]:pi;                                   / match
 (;S si):prune[U;S si;();pi];                                      / prune
 (p;u;s;U;S)}
  • Finding the next supervisor’s favorite student’s favorite supervisor’s project is the slowest function
  • The python implementation has a triple-nested for loop and breaks out immediately upon success
/ given (p)rojects (b)elow (c)apacity boolean vector, s(u)pervisors (b)elow
/ (c)apacity vector, (p)roject to s(u)pervisor map, (p)roject matches,
/ (S)tudent preferences, s(U)pervisor preferences and a single s(u)pervisor
/ (i)ndex, return the s(u)pervisor's preferred (s)tudent and their preferred
/ (p)roject (that is mapped to the supervisor) as a triplet (u;s;p). if no
/ match is found, return the next supervisor index ui.  return an empty list
/ if all supervisors have been exhausted.
nextusp:{[pbc;ubc;pu;p;S;U;ui]
 if[ui=count U;:()];                  / no more supervisors
 if[not ubc ui;:ui+1];                / supervisor at capacity
 pis:S sis:U ui;                      / unpack students and their projects
 pis:pis@'where each (pbc&ui=pu) pis; / supervisor's projects with capacity
 pis:pis@'where each not sis (in/:)' p pis;  / not already matched
 if[not count sp:raze sis (,/:)' pis;:ui+1]; / (student;project)
 usp:ui,first sp;                            / (supervisor;student;project)
 usp}

Student-Allocation Setup

q)2#s:2!("JJ",(-2+count first x)#"S";1#",") 0: x:read0 `:students.csv
name   rank| 0  1  2  3  4  5  6  7  8  9  10 11 12 13 14 15 16 17 18 19 20 21 22 23 ..
-----------| ------------------------------------------------------------------------..
190000 3   | G2 P1 V2 S0 A0 O0 L0 D2 K1 V1 R2 G2 Y2 G2 W0 K0 X0 O1                   ..
190001 56  | Q0 P1 P0 M1 N0 P1 T2 N1 I1 K0 P3 X1 F0 P1 S0 C0 Z0 L0 H2                ..
q)2#p:("SJS";1#",") 0: `:projects.csv
code capacity supervisor
------------------------
A0   2        A         
A1   3        A         
q)2#u:("SJ";1#",") 0: `:supervisors.csv
name capacity
-------------
A    3       
B    1       

Student-Allocation Execution

  • Both approaches return supervisor -> student, project -> student and student -> project dictionaries as well as the pruned supervisor and student preference dictionaries
q)d:preprocess[u;p;s]
q)5#first pusUS:.matching.sas . d`PC`UC`PU`U`S
A0| `long$()
A1| 190019 190034
A2| ,190017
B0| `long$()
B1| ,190091
q)5#pusUS 1
A| 190019 190034 190017
B| ,190091
C| 190003 190062 190068 190079 190070
D| 190008 190009 190015 190039 190056
E| 190022 190063
q)5#pusUS 2
190000| G2
190001| Q0
190002| U0
190003| C0
190004| I0

Performance

The key to performance is elegance, not battalions of special cases – Jon Bentley and Doug McIlroy

Code Profiling

  • When the whole implementation is ~15 lines of code, we need a line-profiler
  • Leslie Goldsmith created the qprof line-profiler in 2015

    q)\l prof.q
    q).prof.prof `.matching
    q)\ts:100 eSR:.matching.sm[G;B]
    82 8672
    q).prof.report`
    Name            Line Stmt                           Count Total     Own       Pct   
    ------------------------------------------------------------------------------------
    .matching.prune 3    S:S@[;sis;drop;]/ris;          1400  00:00.024 00:00.018 22.97%
    .matching.sma   9    (eSR Si;eSR[Ri;ri]):prune[eSR  1600  00:00.043 00:00.009 11.83%
    .matching.drop  0    x _ x?y                        5200  00:00.005 00:00.005 6.28% 
    .matching.sm    2    eSR:sma over eSR;              100   00:00.081 00:00.005 6.65% 
    .matching.sma   2    mi:?[;1b]0<count each S w:wher 1700  00:00.005 00:00.005 6.61% 
    .matching.prune 2    (rp;sis):(0;i)cut rp;          1400  00:00.004 00:00.004 5.61% 
    .matching.sma   3    if[mi=count w;:eSR];           1700  00:00.004 00:00.004 4.89% 
    .matching.sma   6    if[not count[e]=ei:e?ri;if[sir 1600  00:00.004 00:00.004 5.47% 
    .matching.sma   8    e[si]:ri;eSR[0]:e;             1600  00:00.004 00:00.004 5.41% 
    .matching.prune 4    (S;rp)                         1400  00:00.003 00:00.003 4.03% 
    .matching.sma   4    rp:R ri:first s:S si:w mi;     1600  00:00.003 00:00.003 4.68% 
    .matching.sma   5    if[count[rp]=sir:rp?si;:.[eSR; 1600  00:00.003 00:00.003 4.73% 
    .matching.sma   10   eSR                            1600  00:00.003 00:00.003 4.49% 
    .matching.prune 0    if[count[rp]<=i:1+max rp?sis;: 1600  00:00.002 00:00.002 2.64% 
    .matching.sma   0    e:eSR 0;S:eSR Si:1;R:eSR Ri:-1 1700  00:00.002 00:00.002 2.49% 
    .matching.sm    0    eSR:(count[sn]#0N;rn?sp;sn?rp) 100   00:00.000 00:00.000 0.37% 
    .matching.sm    3    eSR:(sn;sn;rn)!'(rn;rn;sn)@'eS 100   00:00.000 00:00.000 0.49% 
    .matching.sm    4    eSR                            100   00:00.000 00:00.000 0.27% 
    

Timing Setup

  • Using PyKX we can access both implementations from python
  • Need to increase recursion limit due to copy.deepcopy call

    sys.setrecursionlimit(10000)  # overcome call to copy.deepcopy
    

Timing Result

  • Using the timeit package we can compare the execution times
  • Python implementation is ~10x slower than q (and worsens with increased dimensions)

    >>> assert smq(sd, rd) == smp(sd, rd)  # assert equality
    >>> timeit.timeit('smq(sd,rd)', number=1, globals=globals())
    0.05793206300000975
    >>> timeit.timeit('smp(sd,rd)', number=1, globals=globals())
    0.5015301169999589
    

Q Enhancements

Nothing happens unless first we dream – Carl Sandburg

Well, the the J mentality is that only J is needed and they’re right – Marshall Lochbaum “Naming is Hard” The Array Cast

PyKX Type Handling (SOLVED)!

  • The python lists (not just np arrays) and dictionary keys are promoted to uniform Kdb+ vectors

    >>> kx.q("0N!",{1:[1,2],2:[2,3]})
    1 2!(1 2;2 3)
    pykx.Dictionary(pykx.q('
    1| 1 2
    2| 2 3
    '))
    

Pattern Matching (SOLVED)!

  • Using over and scan requires packing and unpacking complex state for each iteration

    p:pusUS 0;u:pusUS 1;s:pusUS 2;U:pusUS 3;S:pusUS 4;
    
  • Pattern matching makes this much more elegant

    q)(p;u;s;U;S):pusUS;
    

Native assert (SOLVED)!

  • Unit tests require an assert function that throws an exception when the assertion fails
  • We can use Kdb+ 4.1 pattern matching as a lightweight assert
q)(1b):{[a:1;b:2]} ~ {[a:1;b:1]}
'match
[0]  (1b):{[a:1;b:2]} ~ {[a:1;b:1]}
         ^

YAML Support (NOT SOLVED)

  • The hospital-resident problem capacities, hospitals and residents inputs are stored in YAML files
  • KDB Insights configuration is stored in YAML files
  • YAML supports comments
  • YAML supports more types than JSON (booleans, dates, timestamps, null)
q).y.k "\n" sv read0 `:hospitals.yml

Summary