A.3 Artificial Intelligence - in the context of IDPV Systems
Topic area lead: James Monaghan
Background material
Articles & explainers
Videos
[1hr Talk] Intro to Large Language Models
Very thorough and clear explainer of how LLMs work
Explained: The conspiracy to make AI seem harder than it is! By Gustav Söderström
Goes through how neural networks work, and how they’re used in LLMs, image generators, etc
Courses & tutorials
What should we talk about
Background
We don’t need yet another “introduction to AI” - people can read that elsewhere
Perhaps a simple reminder that “AI” used to be called “machine learning” and before that it was called “data science” and before that it was called “statistics”
At its core, it is about pattern recognition and extrapolation / prediction
Why is this a hot topic now? Availability of compute, data and research has caused a massive acceleration (a brief note on this is enough, not a whole timeline)
Probably need to introduce 2 major innovations that have a direct bearing on the subject:
Transformer models - neural networks that learn context, trained on very large data sets (“foundation models”) - leading to many new applications in NLP (e.g. ChatGPT)
Consider adding background on what a neural network is too
Generative Adversarial Networks - pairing of two neural networks (a generator and a discriminator) - creates very realistic new content (including deepfakes)
The goal is to enable consumers of this information to have sufficient context to understand how the different modalities work and in turn how they apply to IDPV
Consider the output as diagrams / infographics rather than defaulting to a whitepaper format for this material
Relevant modalities
Explanation of what they are, how they work, how they’re used in IDPV today
Do we need to explain the current state of the art, or will that be handled in the other topic areas about defences and attacks?
How are these different to how a human verifies identity?
Computer vision
Optical character recognition
Extracting text from documents (to read information from them)
Object detection
Recognising features on documents (to determine authenticity)
Biometrics (face, fingerprint, palm, voice??)
Matching unique features of a subject against enrolled individuals
Some open source face verification tools which include explanations of how they work:
A leaderboard of face verification models
Just shows accuracy - would be great to augment with robustness
Models available via the DeepFace (PyPI | GitHub) library (with % score against Labeled Faces in the Wild dataset):
VGG-Face (97.78%)
Google FaceNet (99.63%)
OpenFace (93.8%)
Facebook DeepFace (97.35%)
DeepID (99.15%)
Dlib (99.38%)
ArcFace (99.40%)
Models available via FaceTorch:
MagFace+UNPG
AdaFace
Also includes a deepfake detector that might be worth looking into?
Liveness detection
Some background information from FaceTec
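To illustrate the biometric matching step noted above (matching unique features of a subject against enrolled individuals), here is a minimal sketch in plain Python. It assumes each face has already been turned into a numeric embedding vector by some model. The three-dimensional vectors, the names `cosine_similarity` and `best_match`, and the 0.8 threshold are all hypothetical values chosen for illustration; real models produce much longer vectors and each needs its own tuned threshold.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def best_match(probe, enrolled, threshold=0.8):
    """Compare a probe embedding against every enrolled embedding.

    Returns (identity, score) for the closest enrolled identity if its
    score reaches the threshold, otherwise (None, score).
    The threshold is an illustrative assumption, not a recommended value.
    """
    best_id, best_score = None, -1.0
    for identity, embedding in enrolled.items():
        score = cosine_similarity(probe, embedding)
        if score > best_score:
            best_id, best_score = identity, score
    if best_score >= threshold:
        return best_id, best_score
    return None, best_score


# Toy embeddings (hypothetical): real ones come from a face recognition model.
enrolled = {
    "alice": [0.9, 0.1, 0.3],
    "bob": [0.1, 0.8, 0.5],
}

match_id, match_score = best_match([0.85, 0.15, 0.35], enrolled)
unknown_id, unknown_score = best_match([0.1, 0.1, 0.9], enrolled)
```

The first probe sits close to the "alice" embedding and is accepted; the second is not close enough to anyone and is rejected. Moving the threshold trades false accepts against false rejects, which is the trade-off an accuracy leaderboard alone does not show.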
Pattern / anomaly detection
Behavioural analysis
Risk scoring
Natural language processing
Language translation
Disambiguating and “fuzzy matching” (against data sources)
Sentiment analysis??
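As a rough illustration of the "fuzzy matching" item above, the sketch below scores a claimed name against records from a data source using `difflib.SequenceMatcher` from the Python standard library. The helper names (`normalise`, `similarity`, `fuzzy_lookup`), the sample records and the 0.85 threshold are assumptions made for this example; production systems typically use more specialised name-matching techniques.

```python
from difflib import SequenceMatcher


def normalise(name):
    """Lower-case, drop simple punctuation and collapse whitespace."""
    cleaned = name.lower().replace(".", " ").replace(",", " ")
    return " ".join(cleaned.split())


def similarity(a, b):
    """Similarity ratio in [0, 1] between two normalised strings."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()


def fuzzy_lookup(query, records, threshold=0.85):
    """Return (score, record) pairs at or above the threshold, best first.

    The threshold is an illustrative assumption, not a recommended value.
    """
    scored = [(similarity(query, record), record) for record in records]
    return sorted([pair for pair in scored if pair[0] >= threshold], reverse=True)


# Hypothetical records from a data source
records = ["Jonathan Smith", "Jon Smyth", "Maria Garcia"]
results = fuzzy_lookup("Jonathon Smith", records)
```

Here the misspelt "Jonathon Smith" still matches "Jonathan Smith", while "Jon Smyth" falls below the threshold. Whether that second case should count as a match is exactly the kind of disambiguation question this modality raises.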
GANs & Diffusion Models
Creation of realistic image/video/audio
Deepfakes
Where might this go
Speculation about how things might evolve and whether that could lead to new impacts on the IDPV sector
Consider teeing this up here, but the detail should be in the "Attacks" section
Liaise with Heather on future scenario development
Leave the impression that the only certainty is change going forward
Who should we get to contribute?