DataSousChef helps you prepare messy administrative data for analysis. You describe your data. We generate a cleaning script. You run it on your computer. High data quality, high data security.
3 days free · No credit card required to start · Your data never leaves your machine
If you work with administrative data — student records, service data, research datasets — you know the pain. Messy formats, hidden errors, columns that should match but don't. And the data is often too sensitive to share with any tool.
DataSousChef solves this by keeping your data on your machine — always. You describe what your data looks like. We do the technical thinking.
Select what you need. Answer a few plain-language questions. Get a Python script ready to run.
Find hidden problems in a single dataset and make its formats consistent. You don't need to know the technical name for a problem — just recognise it.
For datasets where columns relate to each other. Check that values are logically consistent and flag anything unusual.
Join two or more datasets. The tool helps you decide what the unit of analysis is and which identifiers to use — even if the formats don't quite match.
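To make the three options above more concrete, here is a minimal sketch of the kind of operation each one covers: standardising a format, checking that two columns agree, and normalising an identifier so that two files can be joined. It is an illustration only, not DataSousChef's actual generated output. The function names, the column names (`student_id`, `enrolled`, `withdrawn`), the sample values, and the day-first reading of slash-separated dates are all assumptions made for the example.

```python
import re
from datetime import datetime

# Assumed input formats; slash-separated dates are read day-first here.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%d.%m.%Y")

def standardise_date(value):
    """Return the date as an ISO string, or None if no known format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None

def dates_in_order(start, end):
    """Consistency check: an end date must not come before its start date."""
    # ISO date strings sort in calendar order, so plain comparison is enough.
    return start is None or end is None or start <= end

def normalise_id(value):
    """Keep only the digits and drop leading zeros so ID variants match."""
    return re.sub(r"\D", "", value).lstrip("0")

# Hypothetical rows from two files that should link on the student ID.
enrolments = [{"student_id": "S-00123", "enrolled": "03/09/2021", "withdrawn": "2021-06-30"}]
results = {"123": "pass"}

for row in enrolments:
    start = standardise_date(row["enrolled"])
    end = standardise_date(row["withdrawn"])
    key = normalise_id(row["student_id"])
    # Flag the row if the dates contradict each other, then look up the match.
    print(key, start, end, dates_in_order(start, end), results.get(key))
```

In this sample the withdrawal date falls before the enrolment date, so the consistency check flags the row, while the normalised ID still finds its match in the second file.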
Administrative data — student records, patient data, service data — carries strong protection requirements. DataSousChef is built around one unbreakable rule: no dataset, no sample, no single value is ever uploaded.
You don't need to know Python to use DataSousChef. We handle the technical part.
Answer short questions about your dataset — what columns it has, what they contain, what a correct value looks like. No technical knowledge needed.
5–15 minutes
DataSousChef generates a Python cleaning script tailored exactly to your dataset. Review the plain-language summary, then download it.
Under 30 seconds
Open the script on your computer. It reads your file, cleans it, prints a report showing exactly what changed, and saves the cleaned version.
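For readers curious about what a script of this kind does, the read, clean, report, and save steps can be sketched roughly as follows. This is a simplified illustration that only tidies whitespace, not the script DataSousChef generates. The function names `clean_csv` and `print_report` are hypothetical, and the sketch assumes a UTF-8 CSV file with a header row.

```python
import csv

def clean_csv(src_path, dst_path):
    """Read src_path, tidy every cell, write dst_path, and return the changes."""
    with open(src_path, newline="", encoding="utf-8") as src:
        reader = csv.DictReader(src)
        fieldnames = reader.fieldnames or []
        rows = list(reader)

    changes = []
    for row_no, row in enumerate(rows, start=1):  # counts data rows, not the header
        for column in fieldnames:
            old = row[column] or ""       # short rows yield None for missing cells
            new = " ".join(old.split())   # trim and collapse runs of whitespace
            if new != old:
                changes.append((row_no, column, old, new))
                row[column] = new

    with open(dst_path, "w", newline="", encoding="utf-8") as dst:
        # extrasaction="ignore" skips surplus cells from over-long rows
        writer = csv.DictWriter(dst, fieldnames=fieldnames, extrasaction="ignore")
        writer.writeheader()
        writer.writerows(rows)
    return changes

def print_report(changes):
    """Print one line per changed value so every edit can be reviewed."""
    print(f"{len(changes)} value(s) changed")
    for row_no, column, old, new in changes:
        print(f"  row {row_no}, {column}: {old!r} -> {new!r}")
```

With hypothetical file names, running `print_report(clean_csv("students.csv", "students_cleaned.csv"))` would leave the original file untouched and write the cleaned copy next to it. Everything happens on the local machine.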
Data stays local
If you recognise any of these situations, DataSousChef was made for you.
You've been given an administrative dataset to analyse, but it came from a database with inconsistent formats and coded values that you need to interpret first.
"I know what the data should look like — I just can't write the code to fix it."
You're evaluating a programme or service and you have multiple data extracts that should link together — but the identifiers aren't consistent across files.
"I need to join these two files but one uses student IDs and the other uses usernames."
You work with sensitive data under GDPR — student records, HR data, service data — and you can't upload it to any external tool.
"Every tool I find wants me to upload the data. I'm not allowed to do that."
No credit card needed to start. Set up your account in under two minutes and run your first data cleaning script today.