RDBMS Column Adapters : encrypting fields

DataNucleus has supported RDBMS column adapters since day 1, but the feature has only recently been documented. Let’s take the example of a class Person, something like this

public class Person
{
    Long id;
    String name;
    ...
}

By default, when we persist an object of this class to an RDBMS you will see an SQL statement like

INSERT INTO PERSON (`NAME`,ID) VALUES (<'First'>, <1>);

where the <> arguments are JDBC parameters with the associated values shown. Similarly, retrieval of such objects issues an SQL statement like this

SELECT 'mydomain.model.Person' AS DN_TYPE, A0.ID, A0.`NAME` FROM PERSON A0

So we select the column that represents the field.

An RDBMS Column Adapter is useful where we want to adapt the value being stored in the database column, for example, to encrypt it.

Assuming we are using MariaDB as our datastore, we can encrypt the name field like this

@PersistenceCapable(detachable="true")
public class Person
{
    @PrimaryKey
    Long id;

    @Extension(vendorName="datanucleus", key="select-function", value="AES_DECRYPT(?, 'MyKey')")
    @Extension(vendorName="datanucleus", key="insert-function", value="AES_ENCRYPT(?, 'MyKey')")
    @Extension(vendorName="datanucleus", key="update-function", value="AES_ENCRYPT(?, 'MyKey')")
    String name;

    ...
}

The equivalent annotations for JPA work equally well (in which case the Extension annotation is in package org.datanucleus.api.jpa.annotations).

So we have annotated the field to be encrypted with insert-function/update-function for use when storing the object, and select-function for use when retrieving the object. In this case the persist of the object will invoke the MariaDB function AES_ENCRYPT on the value of the field, and the retrieval will invoke the MariaDB function AES_DECRYPT on the value of the column. You would of course choose a better encryption key than the one shown, and may prefer to hold it in the database instance rather than in the annotation. The SQL statement executed on persist is now

INSERT INTO PERSON (`NAME`,ID) VALUES (AES_ENCRYPT(<'First'>, 'MyKey'),<1>)

and on retrieval is

SELECT 'mydomain.model.Person' AS DN_TYPE,A0.ID,AES_DECRYPT(A0.`NAME`, 'MyKey') FROM PERSON A0

Clearly this idea is not limited to MariaDB; it could be used with PostgreSQL pgp_sym_encrypt/pgp_sym_decrypt, for example, and the equivalent on any other RDBMS. Note also that there are many encryption types available in today’s RDBMS, so do not take this as a recommendation of the above function(s), just that you can use the method outlined here to take advantage of them.
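
To make the substitution concrete, here is a minimal sketch of how a function template with a single "?" placeholder could be applied to either a JDBC parameter marker (on insert/update) or a column reference (on select). This is an illustration only and not DataNucleus code; the class ColumnAdapterSketch and the method applyTemplate are hypothetical names.

```java
// Sketch only: NOT DataNucleus internals. ColumnAdapterSketch and applyTemplate
// are hypothetical names used to illustrate the '?' substitution.
final class ColumnAdapterSketch
{
    /** Replaces the first '?' placeholder in the template with the given operand. */
    static String applyTemplate(String template, String operand)
    {
        int idx = template.indexOf('?');
        if (idx < 0)
        {
            throw new IllegalArgumentException("Template has no '?' placeholder: " + template);
        }
        return template.substring(0, idx) + operand + template.substring(idx + 1);
    }

    public static void main(String[] args)
    {
        String insertFn = "AES_ENCRYPT(?, 'MyKey')";
        String selectFn = "AES_DECRYPT(?, 'MyKey')";

        // On insert the operand is the JDBC parameter marker itself
        System.out.println("INSERT INTO PERSON (`NAME`,ID) VALUES ("
            + applyTemplate(insertFn, "?") + ",?)");

        // On select the operand is the column reference
        System.out.println("SELECT A0.ID," + applyTemplate(selectFn, "A0.`NAME`")
            + " FROM PERSON A0");
    }
}
```

The same template string therefore serves both directions: only the operand that replaces the placeholder differs.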

 

Enjoy!

Posted in Uncategorized | Leave a comment

DN v5.1 : CDI injected AttributeConverters and event listeners

The JPA and JDO APIs introduced the concept of an AttributeConverter, which can be used to define how a type is persisted into the datastore, converting from the Java (field) type to a datastore (column) type. All such AttributeConverter classes defined by JPA and JDO are effectively stateless. They are constructed using a default constructor by the JPA/JDO provider (i.e. DataNucleus). DataNucleus v5.1 now makes it possible to utilise CDI to inject properties into an AttributeConverter.

Let’s say we have a type SecretData that we want to store encrypted in the datastore, with an AttributeConverter like this

public class SecretDataConverter implements AttributeConverter<SecretData, String>{

    public String convertToDatabaseColumn (SecretData attribute) 
    {...}

    public SecretData convertToEntityAttribute (String dbData) 
    {...}
}

Note: we do not recommend encrypting data in this way; it is just an example to demonstrate the injection of state into the converter.

This is all well and good but we want to configure our encryption process, using a different encryptor based on certain conditions. With standard JPA and JDO we have no way of getting state into these classes. Well let’s use CDI to inject a field, modifying our converter like this

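import javax.inject.Inject;
import javax.persistence.AttributeConverter;

public class SecretDataConverter implements AttributeConverter<SecretData, String>
{
    // Populated by CDI once the BeanManager is made available to DataNucleus
    @Inject
    Encryptor encryptor;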

    ...

}

The only thing we need to do now is CDI enable our JPA or JDO environment.

With DataNucleus JPA v5.1 you would simply define the persistence property javax.persistence.bean.manager specifying the BeanManager instance for CDI (will be standard in JPA 2.2).

With DataNucleus JDO v5.1 you would simply define the persistence property datanucleus.cdi.bean.manager specifying the BeanManager instance for CDI.

When you run your application the AttributeConverter is instantiated and it will have this field injected by CDI. The same idea also applies to JPA event listener classes (i.e. just set the persistence property, and follow general CDI documentation for how to inject properties).
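
As a rough illustration of what the converter does with its injected collaborator, here is a self-contained sketch. Everything in it is an assumption made for the example: AttributeConverterSketch is a stand-in that only mirrors the shape of the JPA AttributeConverter so the code compiles without the JPA jar, the Encryptor methods and the SecretData field are hypothetical, and Base64 is merely an encoding used as a placeholder, not encryption. The field is set by hand where CDI would normally inject it.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Stand-in mirroring the shape of the JPA AttributeConverter (assumption, for self-containment)
interface AttributeConverterSketch<X, Y>
{
    Y convertToDatabaseColumn(X attribute);
    X convertToEntityAttribute(Y dbData);
}

// Hypothetical value type; the post does not show its contents
final class SecretData
{
    final String value;
    SecretData(String value) { this.value = value; }
}

// Hypothetical contract for the injected collaborator
interface Encryptor
{
    String encrypt(String plain);
    String decrypt(String encrypted);
}

// Placeholder only: Base64 is an encoding, NOT encryption
final class Base64Encryptor implements Encryptor
{
    public String encrypt(String plain)
    {
        return Base64.getEncoder().encodeToString(plain.getBytes(StandardCharsets.UTF_8));
    }

    public String decrypt(String encrypted)
    {
        return new String(Base64.getDecoder().decode(encrypted), StandardCharsets.UTF_8);
    }
}

final class SecretDataConverterSketch implements AttributeConverterSketch<SecretData, String>
{
    // With CDI enabled this field would carry @Inject and be set by the container
    Encryptor encryptor;

    public String convertToDatabaseColumn(SecretData attribute)
    {
        return attribute == null ? null : encryptor.encrypt(attribute.value);
    }

    public SecretData convertToEntityAttribute(String dbData)
    {
        return dbData == null ? null : new SecretData(encryptor.decrypt(dbData));
    }

    public static void main(String[] args)
    {
        SecretDataConverterSketch converter = new SecretDataConverterSketch();
        converter.encryptor = new Base64Encryptor(); // manual "injection" for the demo

        String stored = converter.convertToDatabaseColumn(new SecretData("top secret"));
        System.out.println("Stored form    : " + stored);
        System.out.println("Restored value : " + converter.convertToEntityAttribute(stored).value);
    }
}
```

Because the converter only depends on the Encryptor interface, the container can supply a different implementation per environment without the converter changing.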


DataNucleus performance through the releases

Performance isn’t always the primary motivation when we are developing DataNucleus, but it always remains something that we bear in mind when introducing features or rationalising the APIs. As an interesting comparison of how performance has changed since version 3.x here we present the results of 3 “performance” tests that we have in the DataNucleus test suite.

Test 1 : Persist 2000 simple objects, and then start the timer. Then in a single thread, do the following 200000 times : get a PM, call pm.getObjectById on an id, 5 times per transaction, and close the PM. Stop the timer.

Test 2 : Persist 2000 simple objects, and then start the timer. Then in each of 10 threads, do the following 60000 times : get a PM, call pm.getObjectById on an id, 5 times per transaction, and close the PM. Stop the timer.

Test 3 : Start the timer. Get a PM, and start a transaction. Call pm.makePersistent on an A object, with a List containing 1 B object, and a single relation to a C object. Repeat the operation for 100000 times. Every 10000 objects call flush. Commit the transaction and close the PM. Stop the timer.

v5.1 : Test 1 – 10.0 secs, Test 2 – 10.0 secs, Test 3 – 10.5 secs

v5.0 : Test 1 – 10.0 secs, Test 2 – 10.0 secs, Test 3 – 13.0 secs

v4.1 : Test 1 – 10.0 secs, Test 2 – 10.0 secs, Test 3 – 13.0 secs

v4.0 : Test 1 – 10.0 secs, Test 2 – 10.0 secs, Test 3 – 13.0 secs

v3.2 : Test 1 – 16.5 secs, Test 2 – 20.5 secs, Test 3 – 13.0 secs

Versions earlier than 3.2 performed worse than 3.2, but running them would mean going back to earlier JREs, so no data is available for comparison. The timings were taken using an embedded H2 database, on an Intel Core i5 with 8GB RAM, running Linux.

So we have a significant improvement in going to v4.x, coming from the allocation of PersistenceManager objects as well as from access to some common properties used by the PersistenceManager, and a further improvement in going to v5.1 in terms of persistence operations.

Clearly this is not a precise science and, as we have said in previous blog entries, a benchmark has to be representative of the operations that your application will be doing for it to mean something to you. If anyone has some simple tests that they want to see in our internal performance tests then you can easily contribute them.

 


DN v5.1 : Meta Annotations

With normal JDO or JPA usage you may have annotated a class like this

@Entity
@DatastoreId
public class Person { ... }

So we have a class that is JPA persistable, and is using the DataNucleus “datastore-identity” extension, providing a surrogate identity column in the datastore. This is fine, and easy enough. But what if you needed to put these 2 annotations on many classes? It would be simpler if you could define your own “composite” annotation that provided them more concisely. This is where we introduce meta-annotations.

If we define our own annotation like this

@Target(TYPE)
@Retention(RUNTIME)
@Entity
@DatastoreId
public @interface DatastoreIdEntity {}

You see that this annotation @DatastoreIdEntity provides both of our normal JPA annotations. So we can now annotate our JPA class like this

@DatastoreIdEntity
public class Person { ... }

Much simpler!

You can do the same thing with JDO annotations, as well as for annotations on fields/methods.

Note that this is new in DataNucleus v5.1, and to use the field/method level JDO/JPA annotations in this way you will have to use the updated (javax.) API jars that are provided with v5.1.
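
For readers curious how a composite annotation can be discovered at runtime, here is a small sketch using plain Java reflection. It is not the DataNucleus implementation; the annotations PersistableSketch, SurrogateIdSketch and SurrogateIdEntitySketch are hypothetical stand-ins for @Entity, @DatastoreId and @DatastoreIdEntity, declared locally so the example runs without any persistence jars. Note that an annotation can only be placed on another annotation if its @Target allows ANNOTATION_TYPE (or it has no @Target), which may be why updated API jars are needed for the field/method level annotations.

```java
import java.lang.annotation.Annotation;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Hypothetical stand-in for @Entity; ANNOTATION_TYPE lets it be used as a meta-annotation
@Target({ElementType.TYPE, ElementType.ANNOTATION_TYPE})
@Retention(RetentionPolicy.RUNTIME)
@interface PersistableSketch {}

// Hypothetical stand-in for @DatastoreId
@Target({ElementType.TYPE, ElementType.ANNOTATION_TYPE})
@Retention(RetentionPolicy.RUNTIME)
@interface SurrogateIdSketch {}

// The composite annotation carrying both of the above
@Target(ElementType.TYPE)
@Retention(RetentionPolicy.RUNTIME)
@PersistableSketch
@SurrogateIdSketch
@interface SurrogateIdEntitySketch {}

@SurrogateIdEntitySketch
class MetaAnnotatedPerson {}

final class MetaAnnotationLookup
{
    /** True if cls carries 'wanted' directly, or one level down via a composite annotation. */
    static boolean hasAnnotation(Class<?> cls, Class<? extends Annotation> wanted)
    {
        if (cls.isAnnotationPresent(wanted))
        {
            return true;
        }
        for (Annotation ann : cls.getAnnotations())
        {
            if (ann.annotationType().isAnnotationPresent(wanted))
            {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args)
    {
        System.out.println("persistable : "
            + hasAnnotation(MetaAnnotatedPerson.class, PersistableSketch.class));
        System.out.println("surrogate id: "
            + hasAnnotation(MetaAnnotatedPerson.class, SurrogateIdSketch.class));
    }
}
```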


DN v5.1 : Find by unique key

With JDO and JPA APIs you have the ability to find individual objects using their “identity”. This “identity” may be a field value for a single field that defines the primary key, or may be an identity object representing a composite primary key made up of multiple fields.

Some classes have other field(s) that are known to be unique for the particular class (termed in some places as “natural id”, or “unique key”), and so it makes sense to allow the user to find objects using these unique key(s). In the latest release (5.1.0 M2) we provide access to this mechanism. Be aware that this is a vendor extension and so you have to make use of DataNucleus classes (and hence not portable until included in the JDO / JPA specs).

Let’s take an example: we have a class representing a driving license. We represent this with an identity that is a unique number. We also have a unique key, which is the license number that the driver is provided with.

With JDO the class is

@PersistenceCapable
public class DrivingLicense
{
    @PrimaryKey
    long id;

    String driverName;

    @Unique
    String number;
}

Consequently we can do as follows to get an object using its identity.

DrivingLicense license = pm.getObjectById(DrivingLicense.class, 1);

retrieving the license with id 1. If however we want to use the new retrieval via unique key we do this

JDOPersistenceManager jdopm = (JDOPersistenceManager)pm;
DrivingLicense license = jdopm.getObjectByUnique(DrivingLicense.class, 
    new String[] {"number"}, new Object[] {"ABCD-1234"});

retrieving the license with number set to “ABCD-1234”. See the JDO docs.

 

Using JPA for the same example we have

@Entity
public class DrivingLicense
{
    @Id
    long id;

    String driverName;

    @Column(unique=true)
    String number;
}

To get an object using its identity we do

DrivingLicense license = em.find(DrivingLicense.class, 1);

and to get an object using its unique key we do

JPAEntityManager jpaem = (JPAEntityManager)em;
DrivingLicense license = jpaem.findByUnique(DrivingLicense.class,
    new String[] {"number"}, new Object[] {"ABCD-1234"});

See the JPA docs.

Notes:

  1. You can have as many unique keys as you want on a class, and this mechanism will support it, unlike with Hibernate “NaturalId” where you can only have 1 per entity.
  2. You use standard JDO/JPA annotations/XML to define the unique key(s), unlike with Hibernate “NaturalId” where you have to use a vendor specific annotation.
  3. If your unique key is made up of multiple fields then you simply specify multiple field name(s) in the second argument of the call, and the corresponding field value(s), in the same order, in the third argument of the call.
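
To show how the two arrays in note 3 pair up, here is a small sketch that turns field names and values into a JDOQL-style filter with one parameter per field. This is only a way of picturing the lookup and is not how DataNucleus implements it; the class UniqueKeyFilterSketch, the method buildFilter and the extra field "country" are hypothetical.

```java
// Sketch only: a way to picture the name/value pairing, NOT the DataNucleus implementation.
final class UniqueKeyFilterSketch
{
    /** Builds a filter such as "number == :p0 && country == :p1" from the field names. */
    static String buildFilter(String[] fieldNames, Object[] fieldValues)
    {
        if (fieldNames.length == 0 || fieldNames.length != fieldValues.length)
        {
            throw new IllegalArgumentException("Need one value per field name");
        }
        StringBuilder filter = new StringBuilder();
        for (int i = 0; i < fieldNames.length; i++)
        {
            if (i > 0)
            {
                filter.append(" && ");
            }
            // The value at index i would be bound to parameter p<i>
            filter.append(fieldNames[i]).append(" == :p").append(i);
        }
        return filter.toString();
    }

    public static void main(String[] args)
    {
        // "country" is a hypothetical second field of a composite unique key
        System.out.println(buildFilter(
            new String[] {"number", "country"},
            new Object[] {"ABCD-1234", "UK"}));
    }
}
```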

@Repeatable annotations for JDO and JPA

DataNucleus now provides access to Java8 @Repeatable annotations for use with JDO and JPA. Previously, if you wanted to specify, for example, multiple indexes for a class using annotations, you would have to do it like this (for JDO) using a container annotation

@Indices({
    @Index(name="MYINDEX_1", members={"field1","field2"}), 
    @Index(name="MYINDEX_2", members={"field3"})})
public class Person
{
    ...
}

The JDO 3.2 annotations have now been upgraded (in javax.jdo v3.2.0-m6) to support the Java8 @Repeatable setting meaning that you can now do

@Index(name="MYINDEX_1", members={"field1","field2"})
@Index(name="MYINDEX_2", members={"field3"})
public class Person
{
    ...
}

The same applies to all standard JDO annotations that have a container annotation.

 

For JPA the same is now also true, though since Oracle seemingly doesn’t care one iota about pushing the JPA spec forward, this is not in the official JPA spec yet. However, since DataNucleus provides its own “standard” javax.persistence jar, we have now published v2.2.0-m1 of this jar, adding support for @Repeatable just like with the JDO 3.2 annotations. So any annotation that has a container annotation can now be repeated on a class/field/method; you just have to use v2.2.0-m1 of the DataNucleus javax.persistence jar. For example

@Entity
@NamedNativeQuery(name="AllPeople", 
    query="SELECT * FROM PERSON WHERE SURNAME = 'Smith'")
@NamedNativeQuery(name="PeopleCalledJones",
    query="SELECT * FROM PERSON WHERE SURNAME = 'Jones'")
public class Person
{
    ...
}
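
For anyone who has not used the Java 8 mechanism itself, the following sketch shows what makes an annotation repeatable and how the repeated values are read back. IndexSketch and IndicesSketch are hypothetical stand-ins for the JDO @Index/@Indices pair, declared locally so the example runs without the JDO jar.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Repeatable;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Hypothetical stand-in for @Index; @Repeatable names its container annotation
@Target(ElementType.TYPE)
@Retention(RetentionPolicy.RUNTIME)
@Repeatable(IndicesSketch.class)
@interface IndexSketch
{
    String name();
    String[] members();
}

// Hypothetical stand-in for @Indices; the container must expose an array-valued value()
@Target(ElementType.TYPE)
@Retention(RetentionPolicy.RUNTIME)
@interface IndicesSketch
{
    IndexSketch[] value();
}

@IndexSketch(name = "MYINDEX_1", members = {"field1", "field2"})
@IndexSketch(name = "MYINDEX_2", members = {"field3"})
class RepeatablePerson {}

final class RepeatableDemo
{
    /** Returns the index names in declaration order. */
    static String[] indexNames(Class<?> cls)
    {
        // getAnnotationsByType looks through the container annotation automatically
        IndexSketch[] indexes = cls.getAnnotationsByType(IndexSketch.class);
        String[] names = new String[indexes.length];
        for (int i = 0; i < indexes.length; i++)
        {
            names[i] = indexes[i].name();
        }
        return names;
    }

    public static void main(String[] args)
    {
        for (String name : indexNames(RepeatablePerson.class))
        {
            System.out.println(name);
        }
    }
}
```

The compiler wraps the repeated annotations in the container behind the scenes, so code that already reads the container annotation keeps working.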

 


DN v5 : Multi-tenancy improvements

JDO and JPA APIs don’t define any support for multi-tenancy, other than where you want to have 1 PMF/EMF per tenant and they have their own database or schema. DataNucleus introduced support for multi-tenancy using the same schema back in v4, whereby the tables that are shared will have an extra discriminator column which specifies the tenant the row applies to, and you specify a persistence property datanucleus.tenantId for the PMF/EMF you are using, defining the tenant it is for. This is fine as far as it goes, but requires that each tenant have their own PMF/EMF. DataNucleus v5 makes this more flexible.

The first change is that you can now specify that same persistence property on the PM/EM (pm.setProperty(…), em.setProperty(…)), so you can now potentially have a PM/EM for each tenant, and the data is separated that way via the tenancy discriminator as in v4. The use-case for this is where you have a web based system and each request has a user, so you create a PM/EM, set the tenant id based on the user, and then each database access will use the appropriate tenant.

PersistenceManager pm1 = pmf.getPersistenceManager();
pm1.setProperty("datanucleus.tenantId", "John");
... // All operations under tenant "John"
pm1.close();

PersistenceManager pm2 = pmf.getPersistenceManager();
pm2.setProperty("datanucleus.tenantId", "Gary");
... // All operations under tenant "Gary"
pm2.close();

The second change is that you can optionally also specify a MultiTenancyProvider, implementing this interface

public interface MultiTenancyProvider
{
     String getTenantId(ExecutionContext ec);
}

and specify the persistence property datanucleus.tenantProvider to point to an instance of your MultiTenancyProvider class. This means that, for example, if you have some session variable that identifies the user and want to share the PM/EM across your users, then you can use this provider instance to define the tenant id for each call. This feature is likely going to be much less useful than the different tenant per PM/EM but it is there for your convenience.
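
As an illustration of the "session variable" idea, here is a sketch of a provider that reads the tenant from a ThreadLocal set at the start of each request. The two interfaces are redeclared locally only so that the sketch compiles without DataNucleus jars (ExecutionContext here is an empty stand-in, not the real type), and ThreadLocalTenantProvider with its setCurrentTenant/clear methods is a hypothetical name.

```java
// Empty stand-in for the DataNucleus ExecutionContext (assumption, for self-containment)
interface ExecutionContext {}

// Same shape as the interface shown above, redeclared so the sketch compiles on its own
interface MultiTenancyProvider
{
    String getTenantId(ExecutionContext ec);
}

// Hypothetical provider that resolves the tenant from the current thread
final class ThreadLocalTenantProvider implements MultiTenancyProvider
{
    private static final ThreadLocal<String> CURRENT_TENANT = new ThreadLocal<>();

    /** Call at the start of a request, once the user is known. */
    static void setCurrentTenant(String tenantId)
    {
        CURRENT_TENANT.set(tenantId);
    }

    /** Call at the end of a request so pooled threads do not leak a tenant. */
    static void clear()
    {
        CURRENT_TENANT.remove();
    }

    @Override
    public String getTenantId(ExecutionContext ec)
    {
        String tenantId = CURRENT_TENANT.get();
        if (tenantId == null)
        {
            throw new IllegalStateException("No tenant set for the current thread");
        }
        return tenantId;
    }

    public static void main(String[] args)
    {
        MultiTenancyProvider provider = new ThreadLocalTenantProvider();
        setCurrentTenant("John");
        System.out.println("Tenant for this call: " + provider.getTenantId(null));
        clear();
    }
}
```

Failing fast when no tenant is set is a deliberate choice in this sketch, so that a request never silently reads or writes another tenant's rows.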


DN v5 : Improved support for Enum persistence

With standard JDO and JPA you can persist a Java enum field as either the name (String-based column) or the ordinal (Number-based column). This is great as far as it goes, so we can easily persist fields using this enum.

public enum Colour{RED, GREEN, BLUE;}

In some situations however you want to configure what is stored. Say, we want to store a numeric value for each enum constant but not the default values (for some reason). In DataNucleus up to and including v4 (for RDBMS only!) we allowed persistence of

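public enum Colour
{
    RED((short)1), 
    GREEN((short)3), 
    BLUE((short)5);

    // Value stored in the datastore instead of the name/ordinal
    private final short code;
    private Colour(short val) {this.code = val;}
    public short getCode() {return this.code;}
    public static Colour getEnumByCode(short val)
    {
        switch (val)
        {
            case 1: return RED;
            case 3: return GREEN;
            case 5: return BLUE;
            default: return null;
        }
    }
}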

and then you set metadata for each enum field of this type to specify that you want to persist the “code” value, like this

@Extensions({
    @Extension(vendorName="datanucleus", key="enum-getter-by-value", value="getEnumByCode"),
    @Extension(vendorName="datanucleus", key="enum-value-getter", value="getCode")
   })
Colour colour;

This worked, but it required too much of the user: a static lookup method on the enum, and two extensions on every field.

 

DataNucleus v5 simplifies it, and you can now do

public enum Colour
{
    RED((short)1), 
    GREEN((short)3), 
    BLUE((short)5);

    // Value stored in the datastore instead of the name/ordinal
    private final short code;
    private Colour(short val) {this.code = val;}
    public short getCode() {return this.code;}
}

and mark each enum field like this

@Extension(vendorName="datanucleus", key="enum-value-getter", value="getCode")
Colour colour;

We don’t stop there however, because the value that is persisted using this extended method can now be either short, int or String. It is also now available for use on RDBMS, Cassandra, MongoDB, HBase, Neo4j, ODF, Excel, and JSON, so your code is more portable too!
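
The reason a reverse lookup method is no longer strictly needed can be pictured with plain reflection: given only the name of the value getter, a provider can compute the stored value for a constant, and can find the constant for a stored value by scanning the enum constants. The sketch below shows that idea; it is an assumption about one possible approach and not the DataNucleus implementation, and ColourSketch and EnumValueGetterSketch are hypothetical names.

```java
import java.lang.reflect.Method;

// Local copy of the enum so the sketch is self-contained
enum ColourSketch
{
    RED((short) 1),
    GREEN((short) 3),
    BLUE((short) 5);

    private final short code;
    ColourSketch(short val) { this.code = val; }
    public short getCode() { return this.code; }
}

// Sketch only: one possible approach, NOT the DataNucleus implementation
final class EnumValueGetterSketch
{
    /** Value to store for the constant, obtained by calling the named getter. */
    static Object toDatastoreValue(Enum<?> constant, String getterName)
    {
        try
        {
            Method getter = constant.getDeclaringClass().getMethod(getterName);
            return getter.invoke(constant);
        }
        catch (ReflectiveOperationException e)
        {
            throw new IllegalStateException("Cannot call " + getterName, e);
        }
    }

    /** Constant whose getter returns the stored value, found by scanning all constants. */
    static <E extends Enum<E>> E fromDatastoreValue(Class<E> enumType, String getterName, Object stored)
    {
        try
        {
            Method getter = enumType.getMethod(getterName);
            for (E constant : enumType.getEnumConstants())
            {
                if (stored.equals(getter.invoke(constant)))
                {
                    return constant;
                }
            }
        }
        catch (ReflectiveOperationException e)
        {
            throw new IllegalStateException("Cannot call " + getterName, e);
        }
        throw new IllegalArgumentException("No constant with value " + stored);
    }

    public static void main(String[] args)
    {
        Object stored = toDatastoreValue(ColourSketch.GREEN, "getCode");
        System.out.println("Stored value : " + stored);
        System.out.println("Read back    : "
            + fromDatastoreValue(ColourSketch.class, "getCode", stored));
    }
}
```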


DN v5 : Support for Java8 “Optional”

Java 8 introduces a class (java.util.Optional) that represents a value that may or may not be present. DataNucleus v5 has a requirement of Java 8, and adds support for persisting and querying fields of this type. Let’s suppose we have a class like this

public class MyClass
{
    private Optional<String> description;
    ...
}

By default DataNucleus will persist this as if it was of the generic type of the Optional (String in this example). Consequently on an RDBMS we will have a column in the datastore of type VARCHAR(255) NULL. If the field description represents a null then it will be persisted as NULL, otherwise as the value contained in the Optional. It supports the generic type being a normal basic type (though not a collection, map, array), or a persistable object.

JDOQL in JDO 3.2 adds support for querying the Optional field, like this

Query q = pm.newQuery(
    "SELECT FROM mydomain.MyClass WHERE description.isPresent()");

So this will return all instances of the class where the description field represents a value. Similarly we can return the value represented by the Optional field, like this

Query q = pm.newQuery(
    "SELECT description.get() FROM mydomain.MyClass");

As you can see, JDOQL makes use of standard Java method namings for accessing the Optional field.

For JPA querying, you can simply refer to the Optional field as if it was of the generic type represented.

This is part of the JDO3.2 spec and you can make use of it in DataNucleus v5+. It is not currently part of the JPA2.1 spec, but you still can make use of it in DataNucleus v5+ when using JPA.
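
The default mapping described above (empty Optional to NULL, present Optional to its contained value) can be pictured with the following sketch. It only illustrates the conversion in both directions using the standard java.util.Optional API and is not DataNucleus code; OptionalColumnSketch and its method names are hypothetical.

```java
import java.util.Optional;

// Sketch only: illustrates the Optional <-> nullable column conversion
final class OptionalColumnSketch
{
    /** Field value to column value: an empty (or unset) Optional becomes NULL. */
    static String toColumnValue(Optional<String> field)
    {
        return field == null ? null : field.orElse(null);
    }

    /** Column value to field value: NULL becomes an empty Optional. */
    static Optional<String> toFieldValue(String column)
    {
        return Optional.ofNullable(column);
    }

    public static void main(String[] args)
    {
        System.out.println(toColumnValue(Optional.of("Some description")));
        System.out.println(toColumnValue(Optional.empty()));
        System.out.println(toFieldValue(null).isPresent());
    }
}
```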


DN v5 : JPA with “nondurable” identity

With standard JPA you only have the ability to have “application” identity for an entity (i.e. field(s) marked as being part of the primary-key). Some time ago we added vendor extension support for having a surrogate (datastore) identity for an entity with DataNucleus. DataNucleus v5 now adds a further vendor extension whereby you can have an entity that has no identity. This is useful where, for example, you have a class that represents a log entry and you have no need to access a specific object (other than searching for all entries with particular field value(s)). You would do this as follows

@Entity
@org.datanucleus.api.jpa.annotations.NonDurableId
public class LogEntry
{
    String level;
    String category;
    String message;

    ...
}

Note that you can specify this in orm.xml also; see the DataNucleus documentation. So we have made use of a DataNucleus-specific annotation, and now we can persist objects of this type and they will not be assigned an identity. Objects of this type can be queried in the normal way; we just cannot use em.find(…), since there is no identity to look up.

This is not part of the JPA 2.1 spec, and is not likely to be included in a JPA spec any time soon, but you can make use of it in DataNucleus v5+.
