Seed data
This is a preview feature. We would love your feedback!
The need for generated data
Generated data, or seed data, is a useful tool in software development, providing an initial set of data for your database.
Here's why it's beneficial:
- Development: Provides a consistent dataset for developers.
- Testing: Ensures predictability when verifying features.
- Demonstration: Showcases your software's abilities with example content.
- Default Data: Adds necessary built-ins, like country lists.
- Onboarding: Gives new users a filled-out starting point.
- Performance: Helps simulate heavy usage scenarios.
- Guides: Common reference in tutorials.
While generated data might be the unsung hero of early development, it can get a bit needy as software evolves. But don't sweat it! Our mission is to save you from the never-ending saga of maintaining those seed scripts. Who has time for that, right?
Introducing @snaplet/seed
The key to effortless generated data is a tool that deeply understands your database's schema. By introspecting your database, we are able to create a fully-typed client dedicated to data generation.
Let's see how it works.
Generating data with @snaplet/seed
You can refer to the Seed Quick Start Guide to learn about the basics of @snaplet/seed.
Inside the workflow
Deterministic data generation
The data generated by @snaplet/seed is fully deterministic: if you run the same plan twice with the same inputs, you will always get the same data as output.
Our default data generation functions are based on copycat.
The operators inject their own seed into each plan they wrap, so that the wrapped plans do not produce duplicate data. For example, these 2 users will have different values:
await snaplet.$merge([
  snaplet.users([{}]),
  snaplet.users([{}]),
]);
You can also use the seed option to specify a custom seed for a plan:
await snaplet.users([{
  id: ({ seed }) => seed,
  email: 'alice@acme.com'
}], { seed: 'hello' });

await snaplet.users([{
  id: ({ seed }) => seed,
  email: 'bob@acme.com'
}], { seed: 'world' });
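To make seed-driven determinism concrete, here is a minimal sketch in plain TypeScript. It does not use @snaplet/seed or copycat: the hash, fakeEmail and generateUsers helpers are hypothetical stand-ins, and the sketch assumes each value is derived by hashing the plan seed together with the model name, row index and field name.

```typescript
// FNV-1a: a small deterministic 32-bit string hash (a stand-in, not copycat).
function hash(input: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    h ^= input.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  return h >>> 0;
}

// Hypothetical generator: the same seed string always yields the same email.
function fakeEmail(seed: string): string {
  return `user-${hash(seed).toString(16)}@example.com`;
}

// Hypothetical plan runner: every row gets its own seed derived from the
// plan seed, so rows differ from each other but runs are repeatable.
function generateUsers(planSeed: string, count: number): { email: string }[] {
  return Array.from({ length: count }, (_, index) => ({
    email: fakeEmail(`${planSeed}/users/${index}/email`),
  }));
}
```

Calling generateUsers('hello', 2) twice returns identical rows, while changing the plan seed changes the input to the hash and therefore, in practice, the generated values.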
Plans, stores and SQL statements
The plan is what's returned by the methods of @snaplet/seed. It's a representation of the data you want to seed.
const userPlan = snaplet.users([{ email: 'snappy@snaplet.dev' }]);
//    ^ the plan               ^ the plan's inputs
When a plan is executed, which is triggered by awaiting the plan, it stores the generated data in an in-memory object called a store.
Each plan has its own store.
Then, the store is turned into SQL statements that are executed to persist the data in your database.
Given the plan from the Seed Quick Start Guide:
Here is what its store looks like internally:
const store = {
  users: [user, user, user, user],
  posts: [post],
  comments: [comment, comment, comment],
}
This store is very useful when you want to connect your models, reusing existing data instead of generating new data.
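As a rough illustration of the "store to SQL statements" step, the sketch below turns a store-shaped object into INSERT statements. It is a simplified model, not the library's implementation: the Row and Store types and the storeToSql helper are hypothetical, and the sketch ignores concerns such as ordering inserts by foreign-key dependencies.

```typescript
type Row = Record<string, string | number | null>;
type Store = Record<string, Row[]>;

// Render a value as a SQL literal, doubling single quotes inside strings.
function toSqlLiteral(value: string | number | null): string {
  if (value === null) return "NULL";
  if (typeof value === "number") return String(value);
  return `'${value.replace(/'/g, "''")}'`;
}

// Emit one INSERT statement per row, table by table.
function storeToSql(store: Store): string[] {
  const statements: string[] = [];
  for (const [table, rows] of Object.entries(store)) {
    for (const row of rows) {
      const columns = Object.keys(row).map((c) => `"${c}"`).join(", ");
      const values = Object.values(row).map(toSqlLiteral).join(", ");
      statements.push(`INSERT INTO "${table}" (${columns}) VALUES (${values});`);
    }
  }
  return statements;
}
```

For example, storeToSql({ users: [{ id: 1, email: 'alice@acme.com' }] }) yields a single INSERT INTO "users" statement.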
Static or dynamic data, we got you covered
If you want to use static data, you can pass an array directly to the model methods. As always, all the required fields and relationships will be automatically created for you based on your database schema.
// an array of 2 users
snaplet.users([
  { email: 'alice@acme.com' },
  { email: 'bob@acme.com' },
]);
If you want to use dynamic data, you can pass a callback function to the model methods. We inject a function x that you can use to generate as many models as you want.
// an array of 10 users
snaplet.users((x) => x(10));

// an array of 10 users with a custom email
snaplet.users((x) => x(10, (index) => ({
  email: `user${index}@snaplet.dev`,
})));
And if you need both static and dynamic data, you can mix them together!
// an array of 10 users
snaplet.users((x) => [
  { email: 'first@acme.com' },
  ...x(8),
  { email: 'last@acme.com' },
])
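If it helps to see why static arrays, callbacks and mixes of the two can all be accepted, here is a small self-contained model of the x helper. The UserInput type and resolveInputs function are hypothetical names for illustration; the real client also fills in the missing fields, which this sketch leaves out.

```typescript
type UserInput = { email?: string };
type X = (count: number, build?: (index: number) => UserInput) => UserInput[];
type UsersInputs = UserInput[] | ((x: X) => UserInput[]);

// x(count) yields `count` empty inputs; x(count, build) customizes each one.
const x: X = (count, build) =>
  Array.from({ length: count }, (_, index) => (build ? build(index) : {}));

// Normalize both input styles into a plain array of inputs.
function resolveInputs(inputs: UsersInputs): UserInput[] {
  return typeof inputs === "function" ? inputs(x) : inputs;
}
```

Because x returns an ordinary array, spreading it between static entries (as in the mixed example above) needs no special handling.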
Connecting data
Using connect callback function
Let's start from our previous plan.
Using the connect callback function, we specify that the author of each comment will be provided from an external source rather than being generated.
We can use the injected store to provide the user model we want to connect to. In this case, the only generated user in the plan will be the author of the post.
graph
If your plan is complex, it can be quite challenging to find the right model to connect to. That's why we also provide the graph variable to the connect function.
The graph contains all the data that was generated as part of the plan, arranged to follow the shape of your plan.
And here is how you can use it:
Here is what the graph looks like for this plan:
// as we started with snaplet.posts, the graph is an array of posts
const graph = [
  {
    author: user,
    comments: [
      comment,
      comment,
      comment,
    ],
  },
];
branch
Now let's make our plan a bit more complex:
- We want to generate 3 posts.
- The author of each post should also be the author of the post's comments.
It seems challenging to find the right user to connect to using store or graph. That's why we also provide the branch variable to the connect function.
The branch is a particular iteration of the graph that was generated. It matches the path to the connect function in which it is injected.
Let's take a look at the branch for our previous plan, adapted to our new requirements:
Generating 3 posts will result in 3 branches, each containing a post, its author and its comments.
In the example above, the connect function will receive the branch corresponding to its iteration. So the first connect function will receive the first branch, the second connect function will receive the second branch, and so on.
Here is what the branch looks like for this particular connect function:
// as we started with snaplet.posts, the branch is a post model
const branch = {
  author: user,
  comments: [comment],
};
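To compare store, graph and branch side by side, here is a self-contained toy generator. It is only a model of the ideas described above, not the @snaplet/seed API: the generatePosts function, its parameters and the shape of the context object passed to connect are assumptions made for illustration.

```typescript
type User = { id: number };
type Comment = { id: number; author: User };
type Post = { id: number; author: User; comments: Comment[] };

type Store = { users: User[]; posts: Post[]; comments: Comment[] };
type ConnectContext = { store: Store; graph: Post[]; branch: Post };
type Connect = (ctx: ConnectContext) => User;

// Generates posts with one new author each; comment authors come from `connect`.
function generatePosts(postCount: number, commentsPerPost: number, connect: Connect) {
  const store: Store = { users: [], posts: [], comments: [] }; // flat, per model
  const graph: Post[] = []; // follows the shape of the plan
  let nextId = 1;
  for (let p = 0; p < postCount; p++) {
    const author: User = { id: nextId++ };
    store.users.push(author);
    const branch: Post = { id: nextId++, author, comments: [] }; // current iteration
    store.posts.push(branch);
    graph.push(branch);
    for (let c = 0; c < commentsPerPost; c++) {
      const comment: Comment = { id: nextId++, author: connect({ store, graph, branch }) };
      store.comments.push(comment);
      branch.comments.push(comment);
    }
  }
  return { store, graph };
}

// Each comment is authored by the author of the post it belongs to.
const result = generatePosts(3, 2, ({ branch }) => branch.author);
```

With store alone, the callback would have to work out which of the three users belongs to the current post; branch hands it the right post directly.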
Using autoConnect option
We provide a special option, called autoConnect, that you can activate in the options of a plan.
When true, the plan automatically connects the relationships it needs to fulfill to models in the store, if possible. The corresponding model is picked randomly but deterministically (we use the copycat.oneOf method under the hood).
In the following example, the post model will be connected to one of the 3 users in the store.
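"Randomly but deterministically" can be pictured as hashing a seed and using the result as an index into the candidates. The sketch below shows that idea in plain TypeScript; hash, pickOne and autoConnectPost are hypothetical helpers standing in for what copycat.oneOf does in the real library, and the seed layout is an assumption.

```typescript
// FNV-1a string hash (a stand-in for copycat's internal hashing).
function hash(input: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    h ^= input.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  return h >>> 0;
}

// Same seed and same candidates always give the same pick.
function pickOne<T>(seed: string, candidates: T[]): T {
  if (candidates.length === 0) throw new Error("nothing in the store to connect to");
  return candidates[hash(seed) % candidates.length];
}

type User = { id: number };
type Post = { title: string; authorId: number };

// Connect a new post to one of the users already in the store.
function autoConnectPost(seed: string, title: string, usersInStore: User[]): Post {
  const author = pickOne(`${seed}/posts/author`, usersInStore);
  return { title, authorId: author.id };
}
```

Running autoConnectPost with the same seed and the same three users always selects the same author, which keeps seeded databases reproducible.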
Augmenting external data with $createStore
Sometimes, you want to augment external data with generated data. For example, you might want to generate 10 posts for a particular user in your database.
For this purpose, we provide the $createStore utility function. It allows you to create a store and pass it as a plan's option:
import { PrismaClient } from "@prisma/client";
import { SnapletClient } from '@snaplet/seed';

const prisma = new PrismaClient();
const snaplet = new SnapletClient();

const user = await prisma.users.findUnique({ where: { email: 'alice@acme.com' } });

const store = snaplet.$createStore({
  users: [user]
});

// The 10 posts will be connected to the unique user in the store
await snaplet.posts((x) => x(10), { autoConnect: true, store });
When using $createStore, the data is marked as external and won't be persisted in the database. Only the generated data will be persisted.
You can set the external option to false if you want the data to be persisted:
// the initial data is not marked as external and will be persisted
const store = snaplet.$createStore({
  users: [user]
}, { external: false });

// we add another user to the store, this time marked as external.
// You can omit the external option, it defaults to true.
store.add('users', anotherUser, { external: true });
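One way to picture the external flag is as a tag stored next to each row, which the persistence step then filters on. The SketchStore class below is a hypothetical, simplified model of that behaviour, not the store implementation shipped with @snaplet/seed.

```typescript
type Row = Record<string, unknown>;
type Entry = { row: Row; external: boolean };

class SketchStore {
  private entries = new Map<string, Entry[]>();

  // Rows added by hand default to external, mirroring the default described above.
  add(model: string, row: Row, options: { external?: boolean } = {}): void {
    const list = this.entries.get(model) ?? [];
    list.push({ row, external: options.external ?? true });
    this.entries.set(model, list);
  }

  // Every row is available for connecting, external or not.
  all(model: string): Row[] {
    return (this.entries.get(model) ?? []).map((entry) => entry.row);
  }

  // Only non-external rows would be turned into SQL statements.
  toPersist(model: string): Row[] {
    return (this.entries.get(model) ?? [])
      .filter((entry) => !entry.external)
      .map((entry) => entry.row);
  }
}
```

An external row can still be the target of a relationship, but it never produces an INSERT because it is assumed to exist in the database already.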
Manipulating stores
We saw that a plan keeps the generated data in a store, and that the store is then turned into SQL statements.
Sometimes, it's easier to break down your plan into multiple plans, and then merge them together. That's why we provide the $pipe and $merge functions.
Using $pipe
The $pipe operator allows you to chain multiple plans together, injecting the store of the previous plan into the next plan.
Using $merge
The $merge operator allows you to merge multiple plans together, without injecting the store of the previous plan into the next plan.
All stores stay independent and are merged together once all the plans are generated.
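The difference between the two operators comes down to what each plan can see while it runs. The following sketch models plans as plain functions over a store; Plan, combine, pipeSketch and mergeSketch are hypothetical names, and the real operators work on @snaplet/seed plans rather than bare functions.

```typescript
type Store = Record<string, unknown[]>;
// A plan receives the store it can connect to and returns the rows it generated.
type Plan = (incoming: Store) => Store;

// Concatenate two stores model by model.
function combine(a: Store, b: Store): Store {
  const out: Store = {};
  for (const source of [a, b]) {
    for (const [model, rows] of Object.entries(source)) {
      out[model] = [...(out[model] ?? []), ...rows];
    }
  }
  return out;
}

// Pipe: each plan sees everything generated by the plans before it.
function pipeSketch(plans: Plan[]): Store {
  return plans.reduce<Store>((store, plan) => combine(store, plan(store)), {});
}

// Merge: each plan starts from an empty store; results are combined at the end.
function mergeSketch(plans: Plan[]): Store {
  return plans.map((plan) => plan({})).reduce<Store>(combine, {});
}

const usersPlan: Plan = () => ({ users: ["alice", "bob"] });
const postsPlan: Plan = (incoming) => ({
  posts: [`post that can see ${(incoming.users ?? []).length} users`],
});
```

Both operators end up with the same models in the final store; only under pipeSketch can postsPlan reuse the users created by usersPlan.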
Customizing @snaplet/seed with aliases
The model, field and relationship names are fully customizable. You can use the alias option to change them.
You need to regenerate your types after changing the alias option.
To do that, run this command in your terminal:
Inflection
Inflection generally refers to the modification of words to express different grammatical categories. In our context, inflection is about altering the form of a model, field or relationship name to fit its intended use or to adhere to certain conventions so your code is more readable.
You can use the inflection option to provide your own rules to define the names of your models, fields and relationships.
We provide a sensible default implementation of the inflection rules, inspired by PostGraphile, but you can override them if you want.
These rules are:
- Model names: pluralized and camelCased.
- Scalar field names: camelCased.
- Parent field names (one to one relationships): singularized and camelCased.
- Child field names (one to many relationships): pluralized and camelCased.
- We also support prefix extraction and opposite baseName for foreign keys.
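The default rules listed above can be approximated with a few string helpers. This is a deliberately naive sketch: the function names are hypothetical, the pluralization only handles simple English words, and it does not cover prefix extraction or opposite baseName handling.

```typescript
// "blog_post" -> "blogPost", "User" -> "user"
function camelCase(name: string): string {
  return name
    .split(/[_\s-]+/)
    .filter(Boolean)
    .map((word, i) =>
      (i === 0 ? word.charAt(0).toLowerCase() : word.charAt(0).toUpperCase()) + word.slice(1))
    .join("");
}

// Naive English singular/plural rules; real inflectors handle many more cases.
function singularize(word: string): string {
  if (/ies$/i.test(word)) return word.slice(0, -3) + "y";
  if (/(s|x|z|ch|sh)es$/i.test(word)) return word.slice(0, -2);
  if (/s$/i.test(word) && !/ss$/i.test(word)) return word.slice(0, -1);
  return word;
}

function pluralize(word: string): string {
  if (/[^aeiou]y$/i.test(word)) return word.slice(0, -1) + "ies";
  if (/(s|x|z|ch|sh)$/i.test(word)) return word + "es";
  return word + "s";
}

// Model names: pluralized and camelCased.
const modelName = (table: string) => pluralize(singularize(camelCase(table)));
// Scalar field names: camelCased.
const scalarFieldName = (column: string) => camelCase(column);
// Parent field names (one to one): singularized and camelCased.
const parentFieldName = (table: string) => singularize(camelCase(table));
// Child field names (one to many): pluralized and camelCased.
const childFieldName = (table: string) => pluralize(singularize(camelCase(table)));
```

Under these rules a table named User becomes the model users, and a column named created_at becomes the field createdAt.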
Given tables named User, Post and Comment, here is an example of how you can use the inflection option to change the names of your models:
And here is how you would change the inflection rules:
Override
If you are not satisfied with the default names of your models, fields or relationships, but you only want to make one-off changes, you can use the override option.
The override option is applied on top of the inflection option, which makes it a good fit for one-off changes.
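"Applied on top of" can be read as: look for an explicit override first, and fall back to the inflection rule otherwise. The sketch below models only that precedence; the Map-based override shape and the resolveModelName helper are assumptions for illustration and do not reflect the actual option format.

```typescript
type Inflect = (tableName: string) => string;

// Hypothetical inflection rule: lower-case the first letter and append "s".
const defaultInflect: Inflect = (table) =>
  table.charAt(0).toLowerCase() + table.slice(1) + "s";

// An explicit override wins; every other table still goes through inflection.
function resolveModelName(
  table: string,
  inflect: Inflect,
  overrides: Map<string, string>,
): string {
  return overrides.get(table) ?? inflect(table);
}

// One-off change: only the User table is renamed.
const overrides = new Map([["User", "members"]]);
```

Here resolveModelName('User', defaultInflect, overrides) gives members, while Post still resolves to posts through the inflection rule.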
Regenerating @snaplet/seed assets
@snaplet/seed provides you with a client library that corresponds to your database structure. To do this, snaplet generate looks at your database structure and uses it to generate assets that @snaplet/seed can use to provide this client library.
Whenever your database structure changes, these assets need to be regenerated so that they are in sync with your new database structure.
If you are familiar with @prisma/client, you can think of snaplet generate as doing the same thing, just in this case for @snaplet/seed rather than @prisma/client.
You can regenerate these assets by running:
npx snaplet generate