How your automation score is calculated.

Most automation estimates look at one thing: whether a task involves a computer. We look at three separate layers, because a task can look fully digital on the surface while being deeply protected underneath.

Layer 1 — Formal I/O
What goes in and what comes out? Digital input + digital output means the task is technically accessible to an AI agent. This is the only layer most analyses use, which is why they systematically overestimate automation risk.
Layer 2 — Contextual Substrate
What does the task actually depend on? Even a "digital" task can rely entirely on 15 years of pattern recognition, unwritten team norms, or information that only exists in someone's head. An AI agent has no access to this substrate.
Layer 3 — Social Function
Does doing the task also accomplish something relational? A client call is never just information transfer — it is the relationship. A performance review is never just feedback — it signals who has authority. Automating the output doesn't preserve the function.

A task is only fully automatable when all three layers are either digital or irrelevant. That's a much smaller surface than a single-layer analysis suggests.
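
As a sketch, the three-layer gate reduces to a single boolean condition. The type and field names below are illustrative, not the production pipeline:

```python
from dataclasses import dataclass

@dataclass
class Task:
    digital_io: bool           # Layer 1: digital input AND digital output
    substrate_protected: bool  # Layer 2: relies on tacit context an agent lacks
    social_function: bool      # Layer 3: the task also does relational work

def fully_automatable(task: Task) -> bool:
    # Only tasks where every layer is digital or irrelevant clear the bar.
    return task.digital_io and not task.substrate_protected and not task.social_function
```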

The two numbers you see.

Your results show two scores, not one. Here's what they each mean:

Naive exposure
The theoretical ceiling.
The percentage of your tasks that are digital-in, digital-out: technically within reach of an AI agent, ignoring everything else. This is the number most analysts report.
Effective exposure
The realistic floor.
Your naive exposure after subtracting the protection contributed by the contextual substrate and social function layers. This is our estimate of what AI could actually replace today, given the full picture of how the work is done.

The gap between the two numbers is your human advantage — the portion of your role that AI can theoretically access but cannot practically replace.
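
A minimal sketch of how the two numbers relate, assuming per-task time weights and protection values in [0, 1]. The field names and the multiplicative discount are illustrative assumptions, not the exact production formula:

```python
def naive_exposure(tasks) -> float:
    # Theoretical ceiling: share of time on digital-in, digital-out tasks.
    total = sum(t["hours"] for t in tasks)
    return sum(t["hours"] for t in tasks if t["digital_io"]) / total

def effective_exposure(tasks) -> float:
    # Realistic floor: the same time share, with each task discounted by
    # its contextual and social protection (both assumed to be in [0, 1]).
    total = sum(t["hours"] for t in tasks)
    reachable = sum(
        t["hours"] * (1 - t["substrate_protection"]) * (1 - t["social_protection"])
        for t in tasks
        if t["digital_io"]
    )
    return reachable / total

def human_advantage(tasks) -> float:
    # The gap between the two numbers.
    return naive_exposure(tasks) - effective_exposure(tasks)
```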

The four task types.

Each task in your profile is classified by whether its input and output are digital (D) or analogue (A). This gives four quadrants:

D→D
On-screen work
Highest AI exposure. Both input and output are digital — writing, analysis, code, data processing.
A→D
In-person + screen output
Partly protected. Requires reading a room, hearing tone of voice, or being physically present — then producing something digital.
D→A
Screen input + physical output
Partly protected. An AI can plan the action but cannot perform it; execution stays physical.
A→A
Fully in-person
Strongest human advantage. The task begins and ends in the physical or relational world.

Your score is a weighted average across your task mix, adjusted by the contextual and social layers. Two people with the same job title can get meaningfully different scores depending on how they actually spend their time.
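
A sketch of the quadrant classification and the weighted average described above. The base exposure weights here are placeholder values chosen for illustration, not the real ones:

```python
def quadrant(input_digital: bool, output_digital: bool) -> str:
    # Classify a task by its I/O: D->D, A->D, D->A, or A->A.
    return f'{"D" if input_digital else "A"}->{"D" if output_digital else "A"}'

# Illustrative base exposure per quadrant; the real weights are not published.
BASE_EXPOSURE = {"D->D": 0.9, "A->D": 0.5, "D->A": 0.4, "A->A": 0.1}

def weighted_score(tasks) -> float:
    # Time-weighted average of quadrant exposure, adjusted downward by
    # each task's contextual and social protection.
    total = sum(t["hours"] for t in tasks)
    return sum(
        t["hours"]
        * BASE_EXPOSURE[quadrant(t["input_digital"], t["output_digital"])]
        * (1 - t["substrate_protection"])
        * (1 - t["social_protection"])
        for t in tasks
    ) / total
```

This is why two people with the same job title diverge: the weighted average runs over their actual task mix, not their title.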

Why some professions show synthetic data — and what that means.

When this project launched, the dataset was empty. And an empty dataset is a broken product: there's no peer comparison, no profession explorer, no sense of what the distribution looks like for any role.

We solved this the same way recommendation systems, marketplaces, and survey platforms do: by seeding the dataset with synthetic-but-structurally-valid profiles while real data builds up.

Synthetic profiles are generated by the same AI pipeline that processes real assessments. They use realistic task decompositions for each profession, plausible time distributions, and the same three-layer classification model. They are explicitly flagged in the database as synthetic — there is no commingling with real data in any analysis or export.

What we do and don't do with synthetic data:

FLAGGED IN DB
Every synthetic record has is_synthetic = true in the database. This flag cannot be changed and is present from the moment the record is created.
VISIBLE ON SITE
When you see a profession page or dashboard entry, the "real / total" count is shown. You can always see how many real assessments are behind a number.
EXCLUDED FROM RESEARCH
Any data export, research partnership, or published analysis uses real-only data. Synthetic profiles are never included in outputs that leave this site.
BEING REPLACED
As real assessments accumulate for a profession, synthetic profiles are phased out. Once a profession reaches enough real data, the scaffolding comes down.
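
In code, the export policy and the on-site count reduce to simple filters over the is_synthetic flag. The record structure is hypothetical; only the flag name comes from the text:

```python
def research_export(records):
    # Anything that leaves the site is filtered to real data only.
    # is_synthetic is set at creation time and never changed afterwards.
    return [r for r in records if not r["is_synthetic"]]

def real_total_counts(records):
    # The "real / total" count shown on profession pages and dashboards.
    real = sum(1 for r in records if not r["is_synthetic"])
    return real, len(records)
```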

Why professions need at least 10 assessments before appearing in the explorer.

The profession explorer and peer comparison features only display professions that have 10 or more total assessments on file. This threshold exists for two reasons:

Statistical stability. A single person's score for "Financial Analyst" tells you almost nothing about Financial Analysts in general. Ten gives you a distribution. Below that, the averages are too noisy to be meaningful.

Re-identification risk. If only one person has ever done an assessment as a "Chief Diversity Officer at a mid-size logistics firm," showing their data in a public explorer would effectively identify them — even without a name. Ten is the floor we've set for the data to be anonymous in aggregate.

If you complete an assessment for a profession that hasn't crossed this threshold yet, your data is still saved and still counts — it just won't appear publicly until the profession has enough profiles for the numbers to be meaningful.
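
The gate itself is simple; a sketch, with only the threshold of 10 taken from the text:

```python
MIN_ASSESSMENTS = 10  # floor for both statistical stability and anonymity

def visible_in_explorer(profession_records) -> bool:
    # Below the threshold the data is still stored and still counts,
    # but the profession is not shown publicly.
    return len(profession_records) >= MIN_ASSESSMENTS
```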

What we collect. What we don't. How to delete it.

This system was designed so that identifying you is structurally impossible — not just against the rules. Here's exactly what exists in the database after you complete an assessment:

JOB TITLE
Stored in normalized form (e.g. "software engineer"). This is the only quasi-identifying field. It's used to group profiles for peer comparison — nothing else.
TASK LIST
The tasks you described and their time distributions. Stored as anonymized JSON. Not linked to any name, employer, or identity.
SCORES
Your naive and effective automation exposure scores, and the quadrant distribution of your tasks.
WHAT WE DON'T STORE
No name. No employer. No IP address. No device fingerprint. No behavioural analytics. No advertising cookies.
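
Putting the list together, one stored record looks roughly like this. Field names are illustrative; is_synthetic is the flag described earlier:

```python
# A hypothetical shape of one stored record.
assessment_record = {
    "job_title": "software engineer",  # normalized; the only quasi-identifier
    "tasks": [                         # anonymized task list with time shares
        {"description": "code review", "share": 0.25, "quadrant": "D->D"},
    ],
    "naive_exposure": 0.72,            # theoretical ceiling
    "effective_exposure": 0.31,        # realistic floor
    "is_synthetic": False,
    # Absent by design: name, employer, IP, device fingerprint, analytics.
}
```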

To delete your data: email info@automatable.me and tell us what you want deleted. No archive, no backup retention, no "we'll keep it 30 days." Gone.

Where this project is going.

The goal is to build the dataset that doesn't exist yet: bottom-up, task-level, worker-reported automation data at scale. Every top-down forecast you've read was produced by economists classifying jobs from the outside. This is the first attempt to collect it from the inside.

Here's the plan as it stands:

Open dataset
Once the real-data volume is large enough to be statistically meaningful, we plan to publish an anonymized, aggregated version of the dataset for researchers. Not individual profiles; profession-level distributions. Real data only.
Research partnerships
The data is relevant to AI companies trying to understand their real addressable market, policymakers designing workforce programs, and universities building future-of-work curricula. Partnerships will only ever use anonymized, aggregated data — never individual records.
Synthetic data sunset
As real profiles accumulate, synthetic data is phased out profession by profession. The long-term goal is a dataset that is entirely real — synthetic profiles are scaffolding, not a permanent feature.
Sponsored tools
When your results identify high-risk tasks, relevant AI tools will be surfaced alongside them, matched by keyword to what you actually do. Sponsorships help cover the cost of running this project and will always be clearly labelled. Until they exist, this project runs on private savings; if you find it useful and want to help keep it going, contributions via PayPal are genuinely appreciated.

If you have questions about the methodology, the data, or a potential research collaboration, reach out at info@automatable.me.