Google Datastore Namespaces
The Google Datastore is one of Google's publicly available, scalable NoSQL solutions. The other is BigTable, which can be installed on a swarm of Compute Engine virtual machines. The Datastore is built on top of BigTable. So with BigTable you get the raw performance, but you miss some goodies, like transactions and the SQL-like query language available with the Datastore.
The Datastore terminology confused me. It introduces "Dataset" and "Namespace", which are not immediately clear and caused me some trouble until I figured them out. That is why I started this post. The welcome page of the Datastore project defines Kind, Entity, Property and Key by comparing them with relational database concepts.
But the documentation does not define what a Dataset is. Through trial and error I found out that the Dataset is the same as your Project ID, which can be found on the Home page of the Google Developer Console.
So in theory, by using different Datasets you can access the Datastore data of different projects, provided you have the appropriate permissions. I am at a loss as to why the "Dataset" concept was introduced, as it is exactly the same as the Project and only adds confusion. All Datastore API endpoints require you to specify the Dataset, which you must remember is your Project ID.
The "Namespace" is actually a cool way to logically partition your unstructured data. Every Dataset (i.e. project) has a default, empty Namespace. So the relationship is: a Dataset (project) contains one or more Namespaces, and each Namespace holds its own Entities.
Under the covers, the Dataset, the Namespace and the Key are concatenated to form the BigTable keys! All Google Cloud applications share one big BigTable for the Entities and several more for the indexes. That is, when you create a new Google Project you do not get a separate BigTable installed for your project; rather, you can access only the data for your Dataset, i.e. only for your Project.
For reference, this is how the welcome page of the Datastore project defines the terminology:

Concept | Datastore | Relational database |
---|---|---|
Category of object | Kind | Table |
One object | Entity | Row |
Individual data for an object | Property | Field |
Unique ID for an object | Key | Primary key |

In this case the Project ID shown on the Developer Console Home page, and therefore the Dataset, is bionic-mercury-89314.
The keys of all your entities are prefixed with:

- The Dataset or, as already said 100 times, the "project id". In this way you have your own little portion of the huuuge BigTable, where your data is stored close together. You, and only you, have access to this portion. So even though all Google customers store their data in one BigTable, there is no messing around with other people's data. Since BigTable stores the data ordered by key, and since all your entities start with the project id prefix, your data is "dense": packed closely together, one row after another, on consecutive BigTable Tablet servers.
- The optional Namespace. Useful for keeping data from different sources apart, or for development/testing/staging/production environments, or wherever you wish to namespace your data.
- The optional parents and the key itself.
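To make the prefix idea a bit more concrete, here is a minimal sketch in plain JavaScript. It is not how BigTable really encodes its row keys (that format is internal to Google); the `rowKey` helper, the `/` separator and the project names are made up for illustration. It only shows that when the Dataset comes first and the Namespace second, an ordered key space keeps each project's entities next to each other, and each Namespace together within a project.

```javascript
// Hypothetical illustration only: the real BigTable key encoding is internal.
// Builds a sortable string from the three parts listed above.
function rowKey(dataset, namespace, path) {
  // An empty string stands for the default Namespace.
  return [dataset, namespace || ''].concat(path).join('/');
}

var keys = [
  rowKey('project-b', '', ['BlogPost', 1]),
  rowKey('project-a', 'Staging', ['BlogPost', 7]),
  rowKey('project-a', '', ['BlogPost', 2]),
  rowKey('project-b', 'Staging', ['BlogPost', 3]),
  // A key with a parent: Blog "myblog" is the parent of BlogPost 5.
  rowKey('project-a', '', ['Blog', 'myblog', 'BlogPost', 5])
];

// BigTable keeps rows ordered by key; a plain string sort mimics that here.
keys.sort();

keys.forEach(function (k) {
  console.log(k);
});
```

After sorting, all `project-a` keys come before any `project-b` key, and inside each project the default Namespace and the "Staging" Namespace form separate contiguous runs.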
```javascript
var gcloud = require('gcloud');

// Select the Dataset based on the projectId;
// the keyFilename is used for authorisation.
var dataset = gcloud.datastore.dataset({
  projectId: 'Your-Project',
  keyFilename: '/path/to/keyfile.json'
});

var blogPostData = {
  title: 'How to make the perfect homemade pasta',
  author: 'Andrew Chilton',
  isDraft: true
};

// Key for an Entity of Kind "BlogPost"
var blogPostKey = dataset.key('BlogPost');

dataset.save({
  key: blogPostKey,
  data: blogPostData
}, function(err) {
  // ... handle the error
});
```
The example creates an Entity for a blog post in the "BlogPost" Kind and the default (i.e. the empty) Namespace. If we want to insert the same Entity into the "Staging" Namespace, then:
```javascript
var gcloud = require('gcloud');

// Same as before, plus the Namespace to work in
var dataset = gcloud.datastore.dataset({
  projectId: 'Your-Project',
  keyFilename: '/path/to/keyfile.json',
  namespace: 'Staging'
});
```
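One way to use this for the development/testing/staging/production split is to derive the Namespace from the environment name. The sketch below is plain JavaScript and does not call gcloud at all; `datasetConfigFor`, the `NAMESPACES` map and the environment names are hypothetical. The object it returns is meant to be passed to `gcloud.datastore.dataset(...)` in the same way as the snippet above, and it assumes that leaving `namespace` out selects the default (empty) Namespace, as in the first example.

```javascript
// Hypothetical helper: maps an environment name to a dataset config object.
// Environments without an entry (e.g. production) get no namespace property,
// which is assumed to mean the default (empty) Namespace.
var NAMESPACES = {
  development: 'Development',
  testing: 'Testing',
  staging: 'Staging'
};

function datasetConfigFor(env) {
  var config = {
    projectId: 'Your-Project',
    keyFilename: '/path/to/keyfile.json'
  };
  if (Object.prototype.hasOwnProperty.call(NAMESPACES, env)) {
    config.namespace = NAMESPACES[env];
  }
  return config;
}

var stagingConfig = datasetConfigFor('staging');
var productionConfig = datasetConfigFor('production');

console.log(stagingConfig.namespace);
console.log('namespace' in productionConfig);
```

Keeping the mapping in one place means the rest of the code never has to mention a Namespace: it just asks for the config of the environment it runs in.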
OK, good that we cleared up the Dataset/Namespace concept. Any questions? Let me know.