Recently I posted about GIS and why I love it, and I hope one of the major takeaways was that GIS is a field based on mapping abstractions onto the real world, and a great way to quickly discover the complexities and difficulties inherent in doing so. This post ranges a lot wider than GIS, but we’ll use it as a jumping-off point.
Coastline Paradox
One of my favourite examples of why mapping abstractions to reality is Hard™ is the coastline paradox1. You can test this yourself: look at a map of the world in a kid’s geography textbook and measure the coastline of Western Australia with some string, and you’ll arrive at an answer. But if you take a bigger, more detailed map of WA from your local suburban chippy (the culturally required one with all the local varieties of fish on it) and do the same, you’ll get a larger answer, as your string has to weave in and out of promontories and inlets that weren’t visible at the lower resolution. You can keep doing this with ever more detailed maps and find that your coastline keeps getting longer, and now you see how this helped inspire Benoit B. Mandelbrot’s work in the field of fractal geometry (protip: the “B.” in his name stands for “Benoit B. Mandelbrot”).
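You can watch the string-and-map effect happen numerically with a classic fractal. A minimal sketch (the Koch curve standing in for a coastline; every refinement level is a “more detailed map”, and the measured length grows by 4/3 each time):

```python
import math

def koch_curve(p0, p1, depth):
    """Recursively replace the segment p0->p1 with the 4-segment Koch motif."""
    if depth == 0:
        return [p0, p1]
    (x0, y0), (x1, y1) = p0, p1
    dx, dy = (x1 - x0) / 3, (y1 - y0) / 3
    a = (x0 + dx, y0 + dy)          # one third along
    b = (x1 - dx, y1 - dy)          # two thirds along
    # apex of the bump: the middle-third vector rotated 60 degrees
    apex = (a[0] + dx * math.cos(math.radians(60)) - dy * math.sin(math.radians(60)),
            a[1] + dx * math.sin(math.radians(60)) + dy * math.cos(math.radians(60)))
    pts = []
    for s, e in [(p0, a), (a, apex), (apex, b), (b, p1)]:
        pts.extend(koch_curve(s, e, depth - 1)[:-1])
    pts.append(p1)
    return pts

def length(pts):
    """Total polyline length -- the 'string' laid along the map."""
    return sum(math.dist(p, q) for p, q in zip(pts, pts[1:]))

for depth in range(5):
    pts = koch_curve((0.0, 0.0), (1.0, 0.0), depth)
    print(depth, round(length(pts), 3))  # grows by a factor of 4/3 per level
```

The same coastline, measured at ever finer resolution, never converges; it just keeps getting longer, which is the paradox in one loop.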
In fact, the only way to precisely and accurately measure the exact coastline of WA (or some other, lesser state) would be to have a map that is equivalent to the size and detail of Australia itself down to every grain of sand. Even then, it would only be accurate for the exact period of time it represented. Wherein we find the root of the problem of mapping abstractions to reality:
The perfect abstraction that entirely encapsulates a real object would be no different from the object itself.
Based upon our discoveries in my previous post this leads to another important lesson for anyone building systems:
Every system that represents real objects in abstract must make trade-offs between accuracy and efficiency and it is of critical importance that these tradeoffs are identified and understood.
Medical Records Are Not Medical Health
Now let’s step away from GIS for a bit with another example: A medical record is not an accurate representation of the health of an individual.
A medical record is the notes of medical professionals who have their own inherent biases, limited time with patients, and unsafe working hours. Doctors are still just people and can make mistakes; they’re not superhuman paragons of healing. These factors will affect the accuracy and precision of medical records, which can be compounded by later visits to health professionals who take what is written as truth rather than listen to the patient. General practitioners will miss things that specialists would catch, but a patient can’t see a specialist without a referral from the very GP who refuses to acknowledge the need for one.
Now if we look at a medical record as an abstraction designed to represent a patient’s medical health, we can see that it does a piss poor job. Sure, it’s great at picking up a lot of acute issues like a broken bone after a car accident2 or providing advice on how to cure scurvy after being a bellend. But you also see all the ways in which it is inaccurate, fallible, and error-prone; especially for people with unusual, compound, or chronic health issues.
My friend WearyBonnie raised an excellent point: even an accurate medical diagnosis or condition is an abstraction! It doesn’t tell you how a condition will manifest specifically in someone; it provides guidelines on likely symptoms and treatments. You have to work within the reality of the patient experience to make any effective progress. Worse still, diagnostic and treatment criteria fall down completely in settings where someone has multiple conditions - the interrelations between them can be so complex (and so rarely studied) that you’re in deeply personal territory. This highlights the real rather than the abstraction, and failure to recognise it can cause so much suffering and damage to a patient.
Digital Twins In Vehicles
So now that we get the idea of why abstractions are not reality, let’s move on to a growing area of VC-bait: “digital twins”. A lot of people hailing this sort of stuff as the future of even consumer cars really tend to ignore the realities.
- Does it factor in the hidden stresses in the metals due to the age of the vehicle?
- How does it capture that the refining or manufacturing process for any component is not 100% perfect and you might end up with the early-mortality rates in a bathtub curve?
- Does your simulated environment accurately match the type of bitumen and frequency of potholes it will impact every day?
- Does it accurately capture driver fatigue and distraction in the ability to operate the vehicle?
- Does it capture the actual road environment it will be operated on with pedestrians, cyclists, bollards, failed traffic lights, and other distracted drivers when factoring in the safety to drive?
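The bathtub curve mentioned above is worth pausing on, because it’s exactly the kind of messy reality a naive twin glosses over. A minimal sketch of one common way to model it, as a mix of Weibull hazard rates - all the shape parameters here are made-up illustrative values, not measurements of any real component:

```python
def weibull_hazard(t: float, k: float, lam: float) -> float:
    """Weibull hazard rate h(t) = (k/lam) * (t/lam)**(k-1)."""
    return (k / lam) * (t / lam) ** (k - 1)

def bathtub_hazard(t: float) -> float:
    """Illustrative bathtub curve: a falling infant-mortality hazard (k < 1),
    a constant random-failure floor, and a rising wear-out hazard (k > 1)."""
    infant = weibull_hazard(t, 0.5, 2.0)     # early manufacturing defects
    wear_out = weibull_hazard(t, 5.0, 10.0)  # fatigue and ageing
    return infant + 0.01 + wear_out

# Hazard is high for brand-new parts, dips in mid-life, then climbs again.
for t in (0.1, 5.0, 12.0):
    print(t, round(bathtub_hazard(t), 3))
```

The point isn’t the exact numbers; it’s that a brand-new component can be *more* likely to fail than a mid-life one, which a twin that assumes “new = perfect” will never predict.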
The rigorous attention to safety in the design, production, and operation of aircraft is part of the reason why it’s safer to fly than to drive. While digital modelling is used to assess the safety and efficiency of aircraft in design and use, it doesn’t remove the need for manual checks before and after every flight, maintenance every so many flight hours, and constant pilot awareness. Aviation is one of the major areas where digital twins are used, but it still respects the difference between the abstraction and the reality. Heck, even electronic sensors can fail or report erroneous results, and pilots trusting them rather than reality can cause horrific crashes.
GIS Question: If you have an autonomous truck and you monitor it via a digital twin, has your underlying roadmap accounted for continental drift since the WGS84 standard (about 2.7m in Australia), or is it barrelling down the footpath heading for a lightpole?
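That ~2.7m figure falls out of back-of-envelope plate kinematics: the Australian plate moves on the order of 7cm a year, and WGS84 was originally defined in 1984. A rough sketch, where both the velocity and the epoch are rounded assumptions for illustration rather than surveyed geodetic values:

```python
# Rough plate-drift arithmetic -- illustrative, not surveyed values.
PLATE_VELOCITY_M_PER_YR = 0.07  # Australian plate, roughly 7 cm/yr NNE
DATUM_EPOCH = 1984.0            # WGS84's original definition epoch

def drift_since_epoch(year: float,
                      epoch: float = DATUM_EPOCH,
                      v: float = PLATE_VELOCITY_M_PER_YR) -> float:
    """Metres of plate motion accumulated since the datum epoch."""
    return (year - epoch) * v

print(round(drift_since_epoch(2023.0), 2))  # roughly the ~2.7 m in the text
```

Two whole lane-widths’ worth of error is a lot of footpath for a truck that trusts its map.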
Abstractions Are Great For Simple Cases
I’m not saying abstractions aren’t useful! If your system doesn’t need that level of detail and provides a strongly specified contract of what it does handle and what it doesn’t handle then you’re probably fine. A good example of this is the warning on first load of MyFireWatch which tells you exactly what the system can do and can’t do.
- It doesn’t supplant emergency services warnings.
- It is only updated when it receives new data and specifies those approximate times.
- It can’t see through clouds.
- It has accuracy/precision limits.
Of course, there is nothing preventing some yob taking screencaps and putting them online to show potential threats to life that aren’t actually real. A better contract would probably stipulate terms of use defining the limits of republishing and requiring that the warning itself be included in any republication.
Abstractions Fall Down In Edge Cases
There’s a bunch of great articles about falsehoods programmers believe about various things and these are all examples of us creating simple abstractions to represent real world systems that we don’t properly understand. Below is an incomplete list of some of my favourite articles about falsehoods I have encountered in the wild:
- Falsehoods programmers believe about names.
- Calendrical fallacies.
- Falsehoods programmers believe about time zones.
- Falsehoods programmers believe about online shopping.
- Falsehoods programmers believe about email.
Of course, we can’t assume that we properly understand falsehoods either, but they’re a wonderful starting point for people - especially people who aren’t from marginalised or minority backgrounds - to learn about edge cases that will completely fuck up their systems; probably not for them, but for end users. And sure, this might not be so bad if you’re just making a website to rank who is hot on campus, but if you’re trying to create a global social network that supplants existing social contracts, and you force at-risk groups to use their real names because your abstraction of what a person or social network is doesn’t match the realities of life, you’re actually creating a huge fucking risk of which you will never face the personal consequences.
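The names falsehoods are easy to demonstrate. Here’s a sketch of the kind of naive validator these articles warn about - it assumes every first name is at least three Latin-ish characters, and it’s a strawman for illustration, not a recommendation:

```python
def naive_name_valid(name: str) -> bool:
    """A deliberately bad validator embodying common name falsehoods:
    'names have at least 3 characters' and 'names are purely alphabetic'."""
    return len(name) >= 3 and name.isalpha()

# All of these are perfectly real names.
for name in ["Ng", "Yu", "O", "Jean-Luc", "María"]:
    print(name, naive_name_valid(name))
# Rejects "Ng", "Yu", and "O" for length, and "Jean-Luc" for its hyphen.
```

Every rejected name here belongs to actual people, and each rejection is the abstraction (“what a name looks like”) failing to match the reality.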
Conclusion
I get this article is mostly a series of loosely connected examples and snarky quips but I hope the core point comes through: when we record information or details we need to understand that this is only a representation and not the reality. It’s an abstraction we are using to represent something that exists (or sometimes things that don’t exist but we pretend they do[1],[2]).
When you first start designing a system, you should take note of which cases it needs to cover and which cases you will never cover. This will help you when you go to pick your abstractions, but you must still clearly document their limits and tolerances, with reference to your system designs. You also need to understand that there are bad actors and real-world factors that don’t match up to whatever glorious ideals you have for your system. You need to spend resources red-teaming your system; not just its cyber-security but its potential (mis)use. You need marginalised voices and expert voices at the table who are empowered to raise the issues they foresee. Doing this you can avoid so many of the pitfalls I’ve discussed above and also not have me show up banging on your door saying “come outside, I just want to ask why your signup form said first names must be at least three characters long”.
2023-09-12 Edit: Added new medical update based on comments from WearyBonnie.
-
Not to be confused with the coastline pair’o’docs, the salt-encrusted boots left on the shore by our most famous Prime Minister and sacrifice to Poseidon. ↩︎
-
Unless they decide to x-ray the wrong limb despite your painful protestations. Source: me, circa 2016. ↩︎