The Art & Science Behind Schema Design

Since joining Structify, I’ve spent most (read: all) of my time thinking about data: specifically, how to make inherently unstructured information make sense to both humans and AI agents.
This process is part creativity, part precision. Or as I like to think of it: art meets science.
🎨 The Art: Creative Schema Design
One of the most powerful aspects of structifying (soon-to-be-official term for using Structify) your data is that you get to decide how it’s shaped. You’re not just working with a fixed format—you’re designing the schema itself: the tables, the properties, and the relationships between them.
It’s not just about capturing data; it’s about how you model it.
Case Study: Clinical Trials Schema
We recently tackled a use case involving clinical trial data—specifically, structuring outcome data like graphs, charts, and results across multiple dosage groups. Our first attempt? A single table with all possible outcomes crammed in as properties.
Spoiler: it didn’t work.
After some whiteboarding and rethinking, we split it into a cleaner, more modular schema. The result made more sense to users and to the AI agents processing the data. You can see what we came up with here.
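To make the "one crammed table vs. modular schema" contrast concrete, here is a rough Python sketch. It is not the schema we actually shipped: every class and field name below (WideTrialRow, Trial, DosageGroup, Outcome, and so on) is invented for illustration, and the sketch assumes outcomes can be reduced to a single numeric value per dosage group.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


# Hypothetical "before": one wide row per trial, with a separate column
# for every outcome/dosage combination. Each new dosage group or outcome
# type means adding more columns.
@dataclass
class WideTrialRow:
    trial_name: str
    low_dose_response_rate: Optional[float] = None
    high_dose_response_rate: Optional[float] = None
    low_dose_adverse_events: Optional[int] = None
    high_dose_adverse_events: Optional[int] = None


# Hypothetical "after": small entities linked by identifiers.
@dataclass
class Trial:
    trial_id: str
    name: str


@dataclass
class DosageGroup:
    group_id: str
    trial_id: str  # relationship: this group belongs to a trial
    label: str     # e.g. "low dose"


@dataclass
class Outcome:
    group_id: str  # relationship: measured for this dosage group
    measure: str   # e.g. "response rate"
    value: float


def outcomes_for_trial(
    trial_id: str, groups: List[DosageGroup], outcomes: List[Outcome]
) -> List[Tuple[str, str, float]]:
    """Collect (group label, measure, value) triples for one trial."""
    labels = {g.group_id: g.label for g in groups if g.trial_id == trial_id}
    return [
        (labels[o.group_id], o.measure, o.value)
        for o in outcomes
        if o.group_id in labels
    ]
```

In the modular version, adding a third dosage group or a new kind of outcome means adding rows, not redesigning the table.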
Schema Design Starter Questions
If you’re building a schema, start here:
- What are the smallest pieces of info you care about? → These are your entities (rows).
- What distinguishes one from another? → These are your properties (columns).
- How are different entities related? → These are your relationships.
Designing schemas is a creative act. The structure you create can shape how data is understood, queried, and displayed. That flexibility is powerful—but it also means the choices you make matter.
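One way to work through the three starter questions is to write the answers down as data. The sketch below is plain Python, not the Structify API; the Property, Table, Relationship, and Schema classes are hypothetical, and the restaurant tables and descriptions are made up for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Property:
    name: str
    description: str


@dataclass
class Table:
    name: str
    description: str
    properties: List[Property] = field(default_factory=list)


@dataclass
class Relationship:
    name: str
    source: str  # name of the table the relationship starts from
    target: str  # name of the table it points to


@dataclass
class Schema:
    tables: List[Table]
    relationships: List[Relationship]

    def dangling_relationships(self) -> List[str]:
        """Return names of relationships that reference undefined tables."""
        known = {t.name for t in self.tables}
        return [
            r.name
            for r in self.relationships
            if r.source not in known or r.target not in known
        ]


schema = Schema(
    tables=[
        Table(
            "RestaurantGroup",
            "A company that owns one or more restaurants.",
            [Property("name", "The group's business name.")],
        ),
        Table(
            "Restaurant",
            "A single physical restaurant location.",
            [
                Property("name", "The name shown on the storefront."),
                Property("city", "The city where the location operates."),
            ],
        ),
    ],
    relationships=[
        Relationship("operates", "RestaurantGroup", "Restaurant"),
        Relationship("serves", "Restaurant", "Menu"),  # Menu is not defined yet
    ],
)

print(schema.dangling_relationships())  # ['serves']
```

Even a tiny check like this catches a common early mistake: sketching a relationship before deciding what the entity on the other end actually is.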
🧪 The Science: Precision Matters
Of course, creativity without clarity leads to chaos. That’s where precision comes in.
When designing a schema, especially one intended for machine consumption, precision is non-negotiable. You need to define elements clearly and avoid ambiguity that might trip up your model—or your users.
Best Practices for Precise Schema Design
- Be unambiguous. Use clear definitions.
- Avoid property overload. Don’t cram multiple concepts into one field.
- Anticipate edge cases. Define behaviors explicitly.
- Test with real examples. What looks good on paper may not work in practice.
Example: The “Name” Property
Let’s say you’re defining a Name field in a Person table. It’s tempting to write:
“The name of the person.”
But what does that mean? First name only? Last name? Both? Nickname?
Here’s a better definition:
“A person’s full name, including (when available) first, middle, and last name—but always AT LEAST a last name.”
This small change gives your model better guidance and your data much more consistency. And just wait until you try modeling titles—trust me, it's trickier than it looks.
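If you also validate extracted values in code, the same definition can be enforced there. This is a minimal sketch, assuming the name has already been split into parts; the PersonName class and its field names are hypothetical and not part of Structify.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PersonName:
    """A person's full name: first and middle when available,
    but always at least a last name."""

    last: str
    first: Optional[str] = None
    middle: Optional[str] = None

    def __post_init__(self) -> None:
        # Enforce the "always at least a last name" rule from the definition.
        if not self.last.strip():
            raise ValueError("a last name is required")

    def full(self) -> str:
        """Join whichever parts are present, in first-middle-last order."""
        parts = (self.first, self.middle, self.last)
        return " ".join(p for p in parts if p)


print(PersonName(last="Doe", first="Jane", middle="Quinn").full())  # Jane Quinn Doe
print(PersonName(last="Doe").full())                                # Doe
```

The vague definition ("The name of the person") gives you nothing to check against. The precise one translates directly into a rule that either passes or fails.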
Final Thoughts
Schema design isn’t just a technical task—it’s an opportunity to shape how data is understood and used. Get it right, and you unlock new ways to search, analyze, and generate insights from your content.
If you’re curious about how this works in action, check out Structify and dive into more examples like this dataset on restaurant groups, restaurants, and menus.
Thanks for reading! If you're structuring complex data, thinking through a schema, or just curious about how AI understands information, I'd love to hear your thoughts. Reach out to me at gabe@structify.ai.
-Gabriel Broome, Data Lead