The biggest fears and reasons for automated document checks

By Dr. Henri Bouma, TNO, The Netherlands

14th January 2021

During the Christmas holiday, I was reflecting on my work in the D4FLY project and asking myself the ‘why’-question: Why am I contributing to the development of new Artificial Intelligence (AI) techniques that can be used to enhance the authentication of travel, identity or breeder documents? I noticed that there are good reasons to develop these techniques, but there are also arguments for being very careful. In this blog, I will summarize the main reasons to desire them, and I will also describe my biggest fears.

Why document analysis?

The first category of reasons shows how document analysis can assist in the prevention of (cross-border) crime.

1.      Terrorism: As a society, we want to prevent terrorists from traveling to our country, especially if they are bringing explosives, firearms, or chemical, biological, radiological or nuclear weapons. Document analysis may assist in the identification of terrorists.

2.      Smuggling of goods: We also want to prevent the smuggling of certain goods, for example because the goods are stolen (e.g., vehicles), because our government doesn’t want to miss out on the tax income (e.g., cigarettes, alcohol), or because the goods ruin the lives of our youngsters (e.g., drugs, cigarettes). Document analysis may assist in the identification of smugglers.

3.      Illegal migration: Some people leave their country of origin and try to cross borders illegally for purely economic reasons, seeking material improvements in their livelihoods. There are legal ways to facilitate migration for economic purposes, e.g., by granting a visa or work permit. However, illegal migration is undesired because it creates negative effects for both the countries of origin and destination. Document analysis may assist in the identification of illegal immigrants.

4.      Human trafficking: Human trafficking is involuntary migration and this is often related to sexual abuse or forced labor. According to the UNODC, it is a grave violation of human rights that is usually combined with threat or use of force, coercion, abduction, fraud, deception, abuse of power or vulnerability, or giving payments or benefits to a person in control of the victim. Document analysis may assist in the identification of traffickers or their victims.

5.      Illegal sales: Identity documents can also help to prevent illegal sales of goods or services. For example, we don’t want to permit the sale of alcoholic drinks to young people and identity documents can be used to verify age. We don’t want to permit the sale of guns to mentally unstable people, and identity documents can be used to see if a person has a history of mental illness. We don’t want banks to be involved in money laundering, and identity documents can be used to guarantee a confirmed relation between a bank account and a person.

6.      Identity fraud: Identity fraud is a crime by itself. It is related to all of the previous reasons, but goes even further. If criminals (in the physical world or the cyber world) take the identity of someone else, they can receive money and transfer it to their own account. Vice versa, they can spend money and enjoy the benefits of an expensive purchase without the burden of a huge debt. They can also use stolen identities to apply for government benefits, causing significant financial loss to the public sector. Document analysis may assist in the prevention of identity fraud.

The second category of reasons shows how document analysis can enable passengers and citizens to freely travel and consume.

7.      Mobility: One of the goals of border control is to facilitate legal movement of people (for business and pleasure), and provide access to a country with minimally obtrusive checks. Analysis of travel documents permits citizens to cross the borders, although I hope that people will use their freedom wisely and minimize negative effects of travel for the environment.

8.      Support refugees: Refugees are leaving their country because it is not safe for them to stay, for example due to war, hunger, or religious/racial/political persecution. We want to help the refugees, by feeding the hungry and clothing the naked, lodging the stranger, and giving them an opportunity to start a new life in a free and safe environment. Document analysis may assist in the distinction between real refugees and illegal immigrants.

9.      Provide goods and services: There are many things that we want to do, but only for the right people. We want to sell alcohol to adults, but not to juveniles. We want to make guns available for policemen and soldiers, but not for everybody. We want to provide an extra allowance, but only for the poor. Document analysis may assist in the facilitation of these services.

Why AI-based automation and digitization?

The third category of reasons is related to the advantages of automation and digitization.

10.   Faster checks: Automation may result in faster document checks and thereby make border crossings faster and less obtrusive. This may have a positive economic effect for business travelers.

11.   More consistent: Border guards may differ in experience and expertise, and it is difficult to be alert every minute of a working day. Automation may result in more reliable and consistent decisions.

12.   Remote workflow: The border guards have different levels of expertise. Some are more trained in document authentication, some in questioning or profiling, and others in facial comparison. The best document experts cannot be everywhere at the same time, and in a normal workflow it is common to let a lesser expert make suboptimal decisions. A digital system may facilitate the rapid transfer of the most challenging documents to the expert to make a final decision within a few seconds or minutes.

13.   Focus on challenge: The support of the system may help the document experts to minimize time on the easy documents (clearly genuine or fraudulent), and create time to invest in the most challenging documents and deepen the knowledge.

14.   Large scale: The burden on border guards and immigration services is enormous. The number of documents that must be checked is large and it takes time to find and educate good document experts. Automation enables the scale-up and allows an organization of limited size to analyze a large collection of documents.

Why not AI-based automation?

So there are many reasons to desire an automated identity/travel/breeder document analysis system, and it seems a perfect solution. But now we get to my biggest fears.

·        Big-brother: We don’t want a society that constantly monitors its citizens, neither in a transparent nor in a concealed way.

·        Losing control: We don’t want a society that prohibits you from entering a country, just because the AI-algorithm analyzed your document and concluded that the traffic light should indicate ‘red’. We may end up on a slippery slope towards AI making non-transparent decisions. Or we might fall into the trap of relying on these AI systems too much and get ‘lazy/blind’ whereas human control remains essential.

·        Unfair decisions: We don’t want to employ a system that is biased, e.g., against ethnic minorities or women, because the algorithm was mainly trained on documents from white males.

·        Data breach: Systems that collect, transfer or process data have the risk of an undesired data breach, which could expose the personal information to an unauthorized person.

·        Not being used: So, in one way, my fear is that the automated document analysis will be used in a wrong way. Yet, my biggest fear is the opposite: that it will not be used, that the crime continues and that we will not benefit from the mentioned advantages. Nevertheless, we must take the above-mentioned concerns seriously. Some suggestions for doing so are the following.

I hope we will benefit from the mentioned advantages and that we will take the appropriate actions to ensure the responsible and ethical use of the technology. This means that the system must be transparent and consistent and the operators should be aware of biases and vulnerabilities. In general, we can learn from the privacy design strategies (from H.J. Hoepman in 2020), ethical principles (Biometrics Institute in 2019) and the ethics guidelines for trustworthy AI (from the AI HLEG in 2019). Here are a few points that are tailored to automated document analysis:

        Explainable & Assisting: The most important point is that we should avoid fully automated decisions that have a negative impact. The document-analysis system should support humans in their decisions. The system can speed up the process of finding a relevant reference document or indicate a suspicious region of the document. This computer-assistance should function in a way that is understandable for a human, so that it can enrich the final decision of the human and reduce the chance that something is overlooked. In some cases, it is not possible to obtain a perfectly unbiased dataset for the development of a system (e.g., the origin of travelers and immigrants is not equally spread over all countries). This is another reason to maintain human oversight to reach a fair decision.

        Minimize & Protect: Data should be well protected with appropriate (cyber) security measures to avoid data breaches. However, the best protection is data reduction or minimization, because you cannot lose what you do not have. Data can be minimized in several ways. For example, do not store data of all passengers, but only of those cases where it is really necessary. And don’t store it forever, but only for the time that it is really necessary. For the assessment of document security features, the personal data is often irrelevant. Therefore, when possible, personal data should be removed with anonymization techniques. Personal data can also be reduced, e.g., from a complete document number to the document series, from a full date of birth to an age (range), or from nationality to target group (e.g., EU / non-EU). The minimization is closely related to the right to be forgotten and it is one of the best strategies to protect the personal data.

        Accountable: It should be possible to verify who accessed the personal data to detect and thereby avoid misuse.  
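To make the minimization and accountability points a bit more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions and not part of the D4FLY system: the names TravelerRecord, MinimizedRecord, minimize and AccessLog are hypothetical, nationality is assumed to be stored as an ISO 3166-1 alpha-3 code, and the document series is assumed to be the first characters of the document number (real documents may encode the series differently).

```python
from dataclasses import dataclass
from datetime import date, datetime, timezone

# Assumption: nationality is an ISO 3166-1 alpha-3 code; these are the 27 EU member states.
EU_MEMBER_CODES = frozenset({
    "AUT", "BEL", "BGR", "HRV", "CYP", "CZE", "DNK", "EST", "FIN",
    "FRA", "DEU", "GRC", "HUN", "IRL", "ITA", "LVA", "LTU", "LUX",
    "MLT", "NLD", "POL", "PRT", "ROU", "SVK", "SVN", "ESP", "SWE",
})


@dataclass(frozen=True)
class TravelerRecord:
    """Hypothetical full record as read from a document."""
    document_number: str
    date_of_birth: date
    nationality: str


@dataclass(frozen=True)
class MinimizedRecord:
    """Reduced record that keeps only coarse, less identifying values."""
    document_series: str
    age_range: str
    target_group: str


def age_range(birth: date, today: date, width: int = 10) -> str:
    """Map a full date of birth to an age bracket such as '30-39'."""
    had_birthday = (today.month, today.day) >= (birth.month, birth.day)
    age = today.year - birth.year - (0 if had_birthday else 1)
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def minimize(record: TravelerRecord, today: date, series_length: int = 2) -> MinimizedRecord:
    """Reduce a full record to series, age range and target group."""
    # Assumption: the series is the leading part of the document number.
    in_eu = record.nationality in EU_MEMBER_CODES
    return MinimizedRecord(
        document_series=record.document_number[:series_length],
        age_range=age_range(record.date_of_birth, today),
        target_group="EU" if in_eu else "non-EU",
    )


class AccessLog:
    """In-memory log of who accessed which record (a real log would be persistent)."""

    def __init__(self) -> None:
        self._entries: list[tuple[datetime, str, str]] = []

    def record(self, operator_id: str, record_id: str) -> None:
        """Store a timestamped entry for one access to one record."""
        self._entries.append((datetime.now(timezone.utc), operator_id, record_id))

    def operators_for(self, record_id: str) -> list[str]:
        """Return the operators that accessed a record, in order of access."""
        return [op for _, op, rid in self._entries if rid == record_id]


if __name__ == "__main__":
    full = TravelerRecord("AB1234567", date(1985, 6, 15), "NLD")
    print(minimize(full, today=date(2021, 1, 14)))
```

The reduced record is what would be stored or passed on for the assessment of security features. The full record can be discarded as soon as it is no longer needed, and every access to it leaves an entry in the log.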

In summary, there are many good reasons to desire automated document analysis, but we must also take ethical guidelines into account, so that it does not become one of my biggest fears.