JPA, Eclipselink and storing hierarchies

Yesterday we worked on the new storage facility for the Epos 0.9 release. The new release will be based on EclipseLink and its JPA implementation. The new Tree store will not be String based (as in 0.8), but provide a better representation for trees using nested sets. There we discovered a problem with tree nodes and cascading their removal. We basically provide two entities to store trees. The TreeEntity and a Node entity. The theory behind that is quiet simple. Basically the Tree hoilds a reference to the root Node (curently it holds references to all Nodes in the Tree) and a Node has a reference to its parent node and to all children.

Here is the basic structure for the TreeEntity


@Entity
public class TreeEntity {
    @OneToMany(cascade={CascadeType.ALL})
    private List nodes;
...
}

and for the nodes


@Entity
public class Node{
    @ManyToOne()    
    private Node parent;
    @OneToMany(cascade={CascadeType.ALL},mappedBy="parent")
    private List children;
...
}

The interesting (and problematic) part is the cascading. When we delete a tree, we want to delete all nodes and when deleting a node, we want all children to be deleted as well. And that works only in theory 🙂 The Problem we discovered seems to be related to a Bug in EclipseLink, but I am not 100% sure about that. What I am sure about is that it is related to the delete order. We have a demo tree with just a root and one child. Successfully deleting the tree results in the following log entry:

--The remove operation has been performed on: test.Tree[id=1]
--The remove operation has been performed on: test.Node[id=2]
--The remove operation has been performed on: test.Node[id=3]
--begin unit of work commit
--begin transaction
--Execute query DeleteObjectQuery(test.Node[id=3])
--DELETE FROM NODE WHERE (ID = ?)
	bind => [3]
--Execute query DeleteObjectQuery(test.Node[id=2])
--DELETE FROM NODE WHERE (ID = ?)
	bind => [2]
--Execute query DeleteObjectQuery(test.Tree[id=1])
--DELETE FROM TREE WHERE (ID = ?)
	bind => [1]
--commit transaction
--end unit of work commit

First, the child node with ID 3 gets deleted, than the root with ID 2 and finally the tree object with ID 1. Note that we didn’t invoke the Node deletion explicitly. We just removed the tree instance. The same call fails in some situations with this log:

--The remove operation has been performed on: test.Tree[id=1]
--The remove operation has been performed on: test.Node[id=2]
--The remove operation has been performed on: test.Node[id=3]
--begin unit of work commit
--begin transaction
--Execute query DeleteObjectQuery(test.Node[id=2])
--DELETE FROM NODE WHERE (ID = ?)
	bind => [2]
--VALUES(1)
--Local Exception Stack: 
Exception [EclipseLink-4002] (Eclipse Persistence Services - 1.0.2 (Build 20081024)): org.eclipse.persistence.exceptions.DatabaseException
Internal Exception: java.sql.SQLException: DELETE on table 'NODE' caused a violation of foreign key constraint 'FK_NODE_PARENT_ID' for key (2).  The statement has been rolled back.
Error Code: 20000
Call: DELETE FROM NODE WHERE (ID = ?)
	bind => [2]
Query: DeleteObjectQuery(test.Node[id=2])
	at org.eclipse.persistence.exceptions.DatabaseException.sqlException(DatabaseException.java:313)

The first thing that happens after calling commit on the transaction is a delete call to the root node with ID 2. That results in the foreign key vialoation because the child node still exists and references the root in its parent field. I was only able to reproduce that behavior on Derby DB in embedded mode – and it does not happen always. Strange !
To solve the problem, we followed the advice in the bug report and added a @PrivateOwned annotation to the Node’s child list while disabling the remove cascade


...
@OneToMany(cascade={CascadeType.MERGE, CascadeType.REFRESH, CascadeType.PERSIST}, mappedBy="parent")
@PrivateOwned
private List children;
...

That did the trick. I still hope that the problem gets fixed in EclipseLink and we can go back to default JPA spec instead of relying on a EclipseLink specific annotation. Just a note, of course we could have done the node deletion manually and take care of the proper order, but that would have added some entity specific deletion code and using @PrivateOwned we can stick to a simple delete call in the Epos workspace, which keeps things simple when using the persistence API.