Now open for early access

Your data isn't clean.
Let's fix that —
without sharing it.

DataSousChef helps you prepare messy administrative data for analysis. You describe your data. We generate a cleaning script. You run it on your computer. High data quality, high data security.

3 days free · No credit card required to start · Your data never leaves your machine

80%
of analysis time
is data preparation

Data cleaning is the hardest part.
It shouldn't also be the most stressful.

If you work with administrative data — student records, service data, research datasets — you know the pain. Messy formats, hidden errors, columns that should match but don't. And the data is often too sensitive to share with any tool.

DataSousChef solves this by keeping your data on your machine — always. You describe what your data looks like. We do the technical thinking.

Three things. Done properly.

Select what you need. Answer a few plain-language questions. Get a Python script ready to run.

Diagnose & Standardise

Find hidden problems in a single dataset and make its formats consistent. You don't need to know the technical name for a problem — just recognise it.

Extra spaces in column headings
"F", "f", "Female" all meaning the same thing
Dates stored as text instead of dates
Missing values coded as 999 or "unknown"
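
To give a feel for what such a script might contain, here is a minimal sketch that handles the four problems listed above. It is an illustration only, not actual DataSousChef output: the column names (`sex`, `start_date`, `age`), the mappings and the date formats are all assumptions made up for this example.

```python
from datetime import datetime

# Hypothetical settings; a real script would be built from your own answers.
SEX_MAP = {"f": "Female", "female": "Female", "m": "Male", "male": "Male"}
MISSING_CODES = {"999", "unknown", ""}
DATE_FORMATS = ("%d/%m/%Y", "%Y-%m-%d")


def parse_date(text):
    """Try each known format; return a date, or None if nothing fits."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    return None


def standardise_row(row):
    """Return a cleaned copy of one record (a dict of column -> text)."""
    clean = {}
    for heading, value in row.items():
        heading = heading.strip()               # extra spaces in column headings
        value = value.strip()
        if value.lower() in MISSING_CODES:      # 999 / "unknown" become missing
            clean[heading] = None
        elif heading == "sex":                  # "F", "f", "Female" -> one label
            clean[heading] = SEX_MAP.get(value.lower(), value)
        elif heading == "start_date":           # date stored as text -> real date
            clean[heading] = parse_date(value)
        else:
            clean[heading] = value
    return clean


raw = {" sex ": "f", "start_date": "03/09/2021", "age": "999"}
print(standardise_row(raw))
```

Note that in this sketch a date that matches none of the listed formats also becomes missing; a fuller script would probably report those separately.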

Cross-Column Checks

For datasets where columns relate to each other: check that values are logically consistent and flag anything unusual.

Age and date of birth must agree
End date can't be earlier than start date
Totals that should add up across columns
Unusual ages or dates that stand out
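
As a rough illustration, the checks above could look something like this in Python. The field names (`dob`, `age`, `passed`, `failed`, `total`) and the plausible age range of 0 to 110 are assumptions chosen for the example, not rules the tool imposes.

```python
from datetime import date


def age_on(dob, when):
    """Whole years between a date of birth and a reference date."""
    years = when.year - dob.year
    if (when.month, when.day) < (dob.month, dob.day):
        years -= 1                              # birthday not yet reached that year
    return years


def check_row(row, reference_date):
    """Return a list of plain-language problems found in one record."""
    problems = []
    if age_on(row["dob"], reference_date) != row["age"]:
        problems.append("age does not match date of birth")
    if row["end_date"] < row["start_date"]:
        problems.append("end date is earlier than start date")
    if row["passed"] + row["failed"] != row["total"]:
        problems.append("passed + failed does not equal total")
    if not 0 <= row["age"] <= 110:              # hypothetical plausible range
        problems.append("age outside plausible range")
    return problems


record = {
    "dob": date(2000, 5, 10), "age": 21,
    "start_date": date(2021, 9, 1), "end_date": date(2021, 6, 30),
    "passed": 3, "failed": 1, "total": 5,
}
for problem in check_row(record, reference_date=date(2021, 9, 1)):
    print(problem)
```

The checks only flag records; deciding whether a flagged value is an error or a genuine outlier stays with you.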

Link Datasets Together

Join two or more datasets. The tool helps you decide what the unit of analysis is and which identifiers to use — even if the formats don't quite match.

Match students across an enrolment file and an outcome file
Identify records that appear in one file but not the other
Check for duplicate IDs before joining
Normalise ID formats before matching
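
A simplified sketch of those four linking steps is shown below. The `student_id` key, the normalisation rule (trim, upper-case, drop hyphens and spaces) and the sample records are hypothetical; your own identifiers may need different rules.

```python
from collections import Counter


def normalise_id(raw_id):
    """One possible rule: trim, upper-case, drop hyphens and inner spaces."""
    return raw_id.strip().upper().replace("-", "").replace(" ", "")


def duplicate_ids(records, key="student_id"):
    """IDs that appear more than once after normalisation."""
    counts = Counter(normalise_id(r[key]) for r in records)
    return sorted(i for i, n in counts.items() if n > 1)


def link(enrolment, outcomes, key="student_id"):
    """Match records on the normalised ID and list the unmatched IDs."""
    enrol = {normalise_id(r[key]): r for r in enrolment}
    out = {normalise_id(r[key]): r for r in outcomes}
    matched = [{**enrol[k], **out[k], key: k}
               for k in sorted(enrol.keys() & out.keys())]
    only_enrolment = sorted(enrol.keys() - out.keys())
    only_outcomes = sorted(out.keys() - enrol.keys())
    return matched, only_enrolment, only_outcomes


enrolment = [{"student_id": " ab-101", "course": "Maths"},
             {"student_id": "AB102", "course": "Physics"}]
outcomes = [{"student_id": "AB101", "grade": "A"},
            {"student_id": "ab 103", "grade": "B"}]

# Check for duplicates first: joining on a repeated ID silently drops rows here.
print("Duplicates:", duplicate_ids(enrolment), duplicate_ids(outcomes))
matched, only_enrolment, only_outcomes = link(enrolment, outcomes)
print("Matched:", matched)
print("Only in enrolment:", only_enrolment)
print("Only in outcomes:", only_outcomes)
```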

🔒 Privacy first

Your data never leaves
your computer.

Administrative data — student records, patient data, service data — carries strong protection requirements. DataSousChef is built around one unbreakable rule: no dataset, no sample, no single value is ever uploaded.

You describe. We generate. You answer questions about your data in plain language. Nothing else is shared.
Run the script yourself. You download a Python script and run it on your own machine. The data stays put.
Compliant by design. Built for organisations working under UK GDPR and similar data protection requirements.

You describe your data: column names, formats and rules, all in plain English
Descriptions travel to our system: no data values, only your descriptions
A Python script comes back: custom-built for your exact dataset
You run it locally: your data never moves. Ever.

Three steps. No coding required.

You don't need to know Python to use DataSousChef. We handle the technical part.

1. Describe your data

Answer short questions about your dataset — what columns it has, what they contain, what a correct value looks like. No technical knowledge needed.

5–15 minutes

2. Get your script

DataSousChef generates a Python cleaning script tailored exactly to your dataset. Review the plain-language summary, then download it.

Under 30 seconds

3. Run it on your machine

Run the script on your computer. It reads your file, cleans it, prints a report showing exactly what changed, and saves the cleaned version.

Data stays local
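
For the curious, the overall shape of such a script (read, clean, report, save) might resemble the sketch below. It is an assumption-laden example rather than real generated output: the only cleaning step is trimming stray spaces, the file names are invented, it assumes every row has the same number of fields as the header, and the demo writes its sample file to a temporary folder so it can run anywhere.

```python
import csv
import tempfile
from pathlib import Path


def clean_file(input_path, output_path):
    """Read a CSV, trim stray spaces, save a cleaned copy, return the change count."""
    changes = 0
    cleaned = []
    with open(input_path, newline="", encoding="utf-8") as src:
        reader = csv.DictReader(src)
        headings = [h.strip() for h in (reader.fieldnames or [])]
        for row in reader:
            new_row = {}
            for heading, value in row.items():
                tidy = (value or "").strip()
                if tidy != value:
                    changes += 1                # count every value that was altered
                new_row[heading.strip()] = tidy
            cleaned.append(new_row)
    with open(output_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.DictWriter(dst, fieldnames=headings)
        writer.writeheader()
        writer.writerows(cleaned)
    # The report: what was read, what changed, where the result was saved.
    print(f"{len(cleaned)} rows read, {changes} values changed")
    print(f"Cleaned file saved to {output_path}")
    return changes


# Demo on a throwaway file; nothing is sent anywhere.
with tempfile.TemporaryDirectory() as folder:
    source = Path(folder) / "records.csv"
    source.write_text("id, name\n1, Ana \n2,Ben\n", encoding="utf-8")
    target = Path(folder) / "records_cleaned.csv"
    changed = clean_file(source, target)
    cleaned_text = target.read_text(encoding="utf-8")
```

The original file is left untouched and the cleaned copy is written alongside it, so you can always compare the two.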

Built for people who know their data — not their code.

If you recognise any of these situations, DataSousChef was made for you.

Researchers & Academics

You've been given an administrative dataset to analyse, but it comes from a database with inconsistent formats and coded values that you need to interpret first.

"I know what the data should look like — I just can't write the code to fix it."

Analysts & Evaluators

You're evaluating a programme or service and you have multiple data extracts that should link together — but the identifiers aren't consistent across files.

"I need to join these two files but one uses student IDs and the other uses usernames."

HE & Public Sector Professionals

You work with sensitive data under GDPR — student records, HR data, service data — and you can't upload it to any external tool.

"Every tool I find wants me to upload the data. I'm not allowed to do that."

✨ Early Access Open

Try DataSousChef free
for 3 days.

No credit card needed to start. Set up your account in under two minutes and run your first data cleaning script today.

3 days fully free
No credit card to start
Your data never leaves your computer
Cancel anytime