The logic of the world is prior to all truth and falsehood.
— Ludwig Wittgenstein[1]
A curated list of falsehoods programmers believe in. A falsehood is an idea that you initially believed to be true, but that turns out to be false in reality.
For example, take this idea: a valid email address has exactly one `@` character. So you use this rule to implement your email-field validation logic. Right? Wrong! In reality, email addresses can contain multiple `@` characters, so your implementation should allow them. The initial idea is a falsehood you believed in.
The articles listed below provide comprehensive lists of such false beliefs, which you should be aware of to become a better programmer.
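The email example above can be sketched in code. This is a minimal illustration rather than a full validator: `split_address` is a hypothetical helper, and it relies on the fact that the mail RFCs allow `@` inside a quoted local part, so only the last `@` separates the local part from the domain.

```python
def split_address(address: str) -> tuple[str, str]:
    """Split an email address on its LAST "@" (hypothetical helper)."""
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not an email address: {address!r}")
    return local, domain


# A quoted local part may itself contain "@", so counting "@" is not enough.
address = '"john@home"@example.com'
print(address.count("@"))      # how many "@" characters the address contains
print(split_address(address))  # (local part, domain)
```

A check such as `address.count("@") == 1` would reject this address, while splitting on the last `@` keeps it intact.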
Contents
Arts
Business
-
Falsehoods about Online Shopping - Covers prices, currencies and inventory.
-
Falsehoods about Prices - Covers currencies, amounts and localization.
-
Falsehoods about IBANs - International Bank Account Numbers are not international.
-
Falsehoods about Economics - Economics is not simple or rational.
-
Decimal Point Error in Etsy’s Accounting System - The importance of types in accounting software: a missing decimal point resulted in 100x overcharges.
-
Twenty five thousand dollars of funny money - Same error as above at Google Ads, or the danger of separating your pennies from your dollars, where $250 internal coupons turned into $25,000. My advice: get rid of integers and floats for monetary values. Use decimals. Or fall back to strings and parse them, don’t validate.
-
Characters `<` and `>` in company names lead to XSS attacks - Because the UK allows companies to be registered with special characters, a hacker leveraged them to register `\"><SCRIPT SRC=MJT.XSS.HT></SCRIPT> LTD`, but also `; DROP TABLE "COMPANIES";-- LTD`, `BETTS & TWINE LTD` and `SAFDASD & SFSAF \' SFDAASF\" LTD`.
-
Minutiae of company names - How the rules of the State of Delaware and the IRS do not intersect.
-
CLDR currency definitions - Currency validity date ranges overlap due to revolts, invasions, new constitutions, and slow planned adoption.
-
tax - A PHP 5.4+ tax management library.
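The advice above about monetary values (use decimals, and parse strings instead of merely validating them) could look like the following sketch. `parse_amount` and `CENT` are hypothetical names, and the two-decimal-place scale is an assumption that does not hold for every currency.

```python
from decimal import Decimal, InvalidOperation

CENT = Decimal("0.01")  # assumption: the currency has two decimal places


def parse_amount(text: str) -> Decimal:
    """Parse a monetary amount from a string into an exact Decimal."""
    try:
        amount = Decimal(text)
    except InvalidOperation:
        raise ValueError(f"not a monetary amount: {text!r}") from None
    if not amount.is_finite():
        raise ValueError(f"not a finite amount: {text!r}")
    cents = amount.quantize(CENT)
    if cents != amount:
        # Reject sub-cent precision instead of silently rounding it away.
        raise ValueError(f"more precision than cents: {text!r}")
    return cents


coupon = parse_amount("250")
print(coupon)  # the amount keeps an explicit two-digit scale
print(parse_amount("0.10") + parse_amount("0.20") == parse_amount("0.30"))
```

Because the value carries its own scale, there is no separate "cents" integer that can be confused with a "dollars" one.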
Cryptocurrency
Dates and Time
-
Falsehoods about Time - Seminal article on dates and time.
-
More Falsehoods about Time - Part 2 of the article above.
-
Falsehoods about Time and Time Zones - Another take on time-related falsehoods, with an emphasis on time zones.
-
Critique of Falsehoods about Time - Takes on the first article above and provides an explanation of each falsehood, with more context and external resources.
-
Falsehoods about Unix Time - Mind the leap second!
-
Falsehoods about Time Zones - Has some nice points regarding the edge-cases of DST transitions.
-
Your Calendrical Fallacy Is Thinking… - List covering intercalation and cultural influence, made by a community of iOS and macOS developers.
-
Time Zone Database - Code and data that represent the history of local time for many representative locations around the globe.
-
The Long, Painful History of Time - Most of the idiosyncrasies in timekeeping can find an explanation in history.
-
You Advocate a Calendar Reform - Your idea will not work. This article tells you why.
-
So You Want to Abolish Time Zones - Abolishing timezones may sound like a good idea, but there are quite a few complications that make it not quite so.
-
The Problem with Time & Timezones - A video about why you should never, ever deal with timezones if you can help it.
-
$26,000 Overcollection by Labor Department - The consequence of wrong calendar accounting.
-
RFC-3339 vs ISO-8601 - A giant list of formats from the two standards, how they overlap, and live examples.
-
ISO-8601, `YYYY`, `yyyy`, and why your year may be wrong - String formatting of dates is hard.
-
UTC is Enough for everyone, right? - There are edge cases about dates and time (specifically UTC) that you probably haven’t thought of.
-
Storing UTC is not a silver bullet - “Just store dates in UTC” is not always the right approach.
-
How to choose between UT1, TAI and UTC - Depends on your priorities between SI seconds, Earth rotation sync, and leap second avoidance.
-
Why is subtracting these two times (in 1927) giving a strange result? - Infamous Stack Overflow answer about both complicated historical timezones, and how historical dates can be re-interpreted by newer versions of software.
-
Critical and Significant Dates - From Y2K to the overflow of 32-bit seconds from Unix epoch, a list of special dates to watch for depending on the system.
- “I’m going to a commune in Vermont and will deal with no unit of time shorter than a season.” - The note left on his terminal by an engineer who quit in the 70s, after too much effort toiling away on sub-second timing concerns. Source: The Soul of a New Machine.
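Two of the topics listed above, leap seconds in Unix time and the overflow of 32-bit seconds since the Unix epoch, can be illustrated with the standard library alone. This is a small sketch, not taken from any of the linked articles; the variable names are made up for the example.

```python
from datetime import datetime, timezone

# 2016-12-31 ended with a leap second (23:59:60 UTC), so that UTC day
# lasted 86,401 SI seconds. Unix time does not count the extra second.
day_start = datetime(2016, 12, 31, tzinfo=timezone.utc)
day_end = datetime(2017, 1, 1, tzinfo=timezone.utc)
unix_seconds = int(day_end.timestamp() - day_start.timestamp())
print(unix_seconds)

# Python's datetime cannot represent the leap second itself.
try:
    datetime(2016, 12, 31, 23, 59, 60, tzinfo=timezone.utc)
    leap_second_representable = True
except ValueError:
    leap_second_representable = False
print(leap_second_representable)

# The last moment a signed 32-bit count of seconds since the epoch can hold.
last_32bit_moment = datetime.fromtimestamp(2**31 - 1, tz=timezone.utc)
print(last_32bit_moment.isoformat())
```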
Education
Emails
Geography
Human Identity
Internationalization
On character encoding, string formatting, Unicode and internationalization.
-
Falsehoods about Language - Translating software from English is not as straightforward as it seems.
-
Falsehoods about Plain Text - Plain text can’t cut it, which makes Unicode even more incredible for its ability to just work well.
-
Falsehoods about text - A subset of the falsehoods from above, illustrated with some examples.
-
Internationalis(z)ing Code - A video about things you need to keep in mind when internationalizing your code.
-
Minimum to Know About Unicode and Character Sets - A good introduction to Unicode, its historical context and origins, followed by an overview of its inner workings.
-
Awesome Unicode - A curated list of delightful Unicode tidbits, packages and resources.
-
Dark corners of Unicode - Unicode is extensive; here be dragons.
-
Let’s Stop Ascribing Meaning to Code Points - Dives deeper into Unicode and dispels myths about code points.
-
Unicode misconceptions - A collection of falsehoods on case, encodings, string length, and more.
-
Breaking Our `Latin-1` Assumptions - Most programmers spend so much time with `Latin-1` that they forget about the quirks of other scripts.
-
Ode to a shipping label - Character encoding is hard, more so when each broken layer of data input adds its own spice.
-
Localization Failure: Temperature is Hard - You cannot localize temperature differences as-is.
-
i18n Testing Data - Compilation of real-world international and diverse name data for unit testing and QA.
-
Big List of Naughty Strings - A huge corpus of strings which have a high probability of causing issues when used as user-input data. A must-have set of practical edge-cases to test your software against.
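Several of the resources above deal with string length, case and code points. The sketch below shows the kind of surprise they describe, using only Python's standard `unicodedata` module; `same_text` is a hypothetical helper, and NFC is just one of several normalization forms you might choose.

```python
import unicodedata

composed = "\u00e9"     # "é" as a single code point
decomposed = "e\u0301"  # "e" followed by a combining acute accent

# Both render as "é", yet they differ in equality and in length.
print(composed == decomposed)
print(len(composed), len(decomposed))


def same_text(a: str, b: str) -> bool:
    """Compare two strings after normalizing both to NFC (hypothetical helper)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


print(same_text(composed, decomposed))

# Case mapping can change the number of code points.
print("ß".upper())
```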
Management
Multimedia
-
Falsehoods about Video - Covers it all: video decoding and playback, files, image scaling, color spaces and conversion, displays and subtitles.
-
Horrible edge cases to consider when dealing with music - Music catalog data is full of crazy stuff.
-
MusicBrainz database schema - An open-source project and database that seems to have solved the complexity of music catalog management.
-
DDEX - The industry standard for music metadata, including archiving, sound recording, sales and usage reporting, royalties and license deals.
-
Apple Music Style Guide - Quality assurance guidelines to format music, art, and metadata to increase discoverability.
Networks
Phone Numbers
Postal Addresses
-
Falsehoods about Addresses - Covers streets, postal codes, buildings, cities and countries.
-
Falsehoods about Residence - It’s not only about the address itself, but the relationship between a person and their residence.
-
Letter Delivered Despite No Name, No Address - Ultimate falsehood about postal addresses: you do not need one.
-
UK Address Oddities - Quirks extracted from a list of most residential property sales in England and Wales since 1995.
-
What is the Most Minimal UK Address Possible? - The trick is to rely on postcodes, which in the UK are pretty specific and “often identify one or a few specific buildings, unlike countries where a postcode represents an entire neighbourhood”.
-
The Bear with Its Own ZIP Code - Smokey Bear has his own ZIP Code (`20252`) because he gets so much mail.
-
Why doesn’t Costa Rica use real addresses? - Costa Rica uses an idiosyncratic system of addresses that relies on landmarks, history and quite a bit of guesswork.
-
Regex and Postal Addresses - Why regular expressions and street addresses do not mix.
-
Parsing the Infamous Japanese Postal CSV - “I saw many horrors, but I’ve never seen this particular formatting choice anywhere else.”
-
USPS Postal Addressing Standards - Describes both standardized address formats and content.
-
libaddressinput - Google’s common C++ and Java library for parsing, formatting, and validating international postal addresses.
-
addressing - A PHP 5.4+ addressing library, powered by Google’s dataset.
-
postal-address - Python module to parse, normalize and render postal addresses.
-
address - Go library to validate and format addresses using Google’s dataset.
Science
Society
Software Engineering
-
Falsehoods about Versions - Attributing an identity to a software release might be harder than thought.
-
Falsehoods about Build Systems - Building software is hard. Building software that builds software is harder.
-
Falsehoods about Undefined Behavior - Invoking undefined behavior can cause anything to happen, for a much broader definition of “anything” than one might think.
-
Falsehoods about CSVs - While RFC 4180 exists, it is far from definitive and goes largely ignored.
-
Falsehoods about Package Managers - Covers packages and their managers.
-
Falsehoods about Testing - An attempt to establish a list of falsehoods about testing.
-
Falsehoods about Search - Why search (including analysis, tokenization, highlighting) is deceptively complex.
-
What every software engineer should know about search - A better sourced article on the difficulty of implementing search engines.
-
Falsehoods about Pagination - Why your pagination algorithm is giving someone (possibly you) a headache.
-
Falsehoods about garbage collection - Misconceptions about the predictability and performance of garbage collection.
-
Myths about File Paths - The diversity of file systems and OSes makes file paths a little harder than we might think.
-
The weird world of Windows file paths - “On any Unix-derived system, a path is an admirably simple thing: if it starts with a `/`, it’s a path. Not so on Windows.”
-
Myths about CPU Caches - Misconceptions about caches often lead to false assertions, especially when it comes to concurrency and race conditions.
-
Myths about `/dev/urandom` - There are a few things about `/dev/urandom` and `/dev/random` that are repeated again and again. Still they are false.
-
Facts about State Machines - State machines are often misunderstood and under-applied.
-
Hi! My name is… - This talk could have been named falsehoods about usernames (and other identifiers).
-
Popular misconceptions about `mtime` - Part of a post on why comparing the `mtime` of files could be considered harmful.
-
Rules for Autocomplete - Not falsehoods per se, but still a great list of good practices to implement autocompletion.
-
Floating Point Math - “Your language isn’t broken, it’s doing floating point math. (…) This is why, more often than not, `0.1 + 0.2 != 0.3`.”
-
The yaml document from hell - YAML is full of obscure complexity like accidental numbers and non-string keys.
-
I am endlessly fascinated with content tagging systems - There are edge-cases even in tagging systems, which are supposed to be bare-bones.
-
Falsehoods about Quantum Technology - Common misconceptions about quantum technology and computers.
-
Falsehoods about Event-Driven Systems - Misconceptions about event-driven systems and message passing.
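The floating point entry above quotes `0.1 + 0.2 != 0.3`. The sketch below reproduces it and shows two common ways around it, a tolerance-based comparison and exact decimal arithmetic. The variable names are illustrative only, and which workaround fits depends on the application.

```python
import math
from decimal import Decimal

total = 0.1 + 0.2

# Binary floating point cannot represent 0.1, 0.2 or 0.3 exactly.
exactly_equal = total == 0.3

# Option 1: compare with a tolerance instead of exact equality.
close_enough = math.isclose(total, 0.3)

# Option 2: use exact decimal arithmetic, built from strings, not floats.
decimal_equal = Decimal("0.1") + Decimal("0.2") == Decimal("0.3")

print(exactly_equal, close_enough, decimal_equal)
```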
Transportation
Typography
Video Games
-
The Door Problem - All the things you have not considered implementing for your doors in games.
Web
Contributing
Your contributions are always welcome! Please take a look at the contribution guidelines first.
This list has gained some popularity on social media over the past few years. See where it has been discussed and mentioned elsewhere.
The header image is based on a modified photo taken in February 2010 by Iza Bella, distributed under a Creative Commons BY-SA 2.0 UK license.
[1]: Notebooks, 1914-1916 (Liveright, 2022) - source: page 14e.