Big Data technologies, and selection algorithms in particular, are drawing growing criticism from civil-liberties advocates. I echoed those concerns myself in an article calling for more transparent algorithms.
The French newspaper Liberation revealed that the Ministry of National Education faces legal pressure to open up the code behind the APB algorithm, which universities use to select incoming freshmen. The variables used by this algorithm remained secret until a student association, helped by lawyer Jean Merlet-Bonnan, put officials under pressure and eventually obtained a 200-page instruction leaflet explaining to users how to operate the selection tool.
The variables used in the selection algorithm
Written in a technical style, the leaflet reveals the main criteria used to select candidates:
- area of origin (the area being freely customizable)
- whether or not the student is in “reorientation” (which means that the student has changed field of study because of insufficient results)
- admission criteria imposed for certain courses (universities can define their own exclusion criteria for those courses)
The admission process is then followed up using lists generated from freely selectable criteria such as country of origin, age, third given name, …
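To make the mechanism concrete, here is a minimal sketch of the kind of rule-based filter the leaflet describes. All names, fields, and the function itself are illustrative assumptions on my part, not the actual APB code:

```python
# Hypothetical sketch of a customizable selection filter (assumed names;
# not the real APB implementation).

def select_candidates(candidates, allowed_areas, exclusion_rules):
    """Keep candidates whose area is allowed and who trip no exclusion rule."""
    selected = []
    for c in candidates:
        if c["area"] not in allowed_areas:
            continue  # criterion 1: area of origin (freely customizable)
        if any(rule(c) for rule in exclusion_rules):
            continue  # criterion 3: course-specific exclusion criteria
        selected.append(c)
    return selected

candidates = [
    {"name": "A", "area": "Paris", "reorientation": False},
    {"name": "B", "area": "Lyon", "reorientation": True},
    {"name": "C", "area": "Paris", "reorientation": True},
]

# One possible configuration: restrict the area and exclude
# "reorientation" students (criterion 2).
result = select_candidates(
    candidates,
    allowed_areas={"Paris"},
    exclusion_rules=[lambda c: c["reorientation"]],
)
print([c["name"] for c in result])  # → ['A']
```

The point of the sketch is that nothing in the code fixes the criteria: whoever configures `allowed_areas` and `exclusion_rules` decides who gets through.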
A perfect example of a biased “algorithm”
I doubt a tool that can be customized by each user to suit his or her own needs deserves the name “algorithm”. In my opinion, an algorithm must produce “responses” that are homogeneous in the way they are derived. Nor does a selection criterion like “country of origin” need any further comment.
In the present case, the tool used by the French public education system is apparently ill-conceived. It is designed to give the appearance of equality while still letting users tweak the system to achieve the desired outcome. Rather than an equal process, those lines of code grant discretionary power: each university can select the candidates it wants and reject the others.
Liberation gives one stunning example. The area of origin can be freely adjusted by the university to cover a larger or smaller territory, and can even be narrowed down to a hand-picked set of high schools. There is no better illustration of “selection bias”.
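That example can be sketched in a few lines. Again, the names and data are purely illustrative assumptions, not the real system: the same filter, with its “area” parameter narrowed from a whole region down to a single favoured high school, produces a completely different admitted pool.

```python
# Illustrative sketch (assumed names, not real APB code): tightening the
# "area" parameter is all it takes to bias the outcome.

def filter_by_area(candidates, allowed):
    return [c for c in candidates if c["origin"] in allowed]

pool = [
    {"name": "A", "origin": "Lycée X"},
    {"name": "B", "origin": "Lycée Y"},
    {"name": "C", "origin": "Lycée Z"},
]

# Broad setting: a whole region (all three schools assumed to be inside it).
broad = filter_by_area(pool, allowed={"Lycée X", "Lycée Y", "Lycée Z"})

# Narrow setting: reduced to one hand-picked high school.
narrow = filter_by_area(pool, allowed={"Lycée X"})

print(len(broad), len(narrow))  # → 3 1
```

Same code, same candidates; only the configuration changed, and so did who gets in.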
This example pretty much sums up the worst aspects of ill-conceived and wrongly applied algorithmic tools: an opaque algorithm is developed and handed to users who can freely modify its parameters to pursue undocumented goals.
Everything is wrong in this process, and that is why more transparency and ethics are needed when it comes to Big Data and the algorithms that influence the lives of thousands.